HUMAN ORPHAN G PROTEIN-COUPLED RECEPTORS

This patent document claims priority benefit of each of the following applications,
all filed with the United States Patent and Trademark Office via U.S. Express Mail on the
indicated filing dates: U.S. Provisional Number 60/121,852, filed February 26, 1999,
claiming the benefit of U.S. Provisional Number 60/109,213, filed November 20, 1998;
U.S. Provisional Number 60/120,416, filed February 16, 1999; U.S. Provisional Number
60/123,946, filed March 12, 1999; U.S. Provisional Number 60/123,949, filed March 12,
1999; U.S. Provisional Number 60/136,436, filed May 28, 1999; U.S. Provisional
Number 60/136,439, filed May 28, 1999; U.S. Provisional Number 60/136,567, filed May
28, 1999; U.S. Provisional Number 60/137,127, filed May 28, 1999; U.S. Provisional
Number 60/137,131, filed May 28, 1999; U.S. Provisional Number 141,448, filed June
29, 1999, claiming priority from U.S. Provisional Number 60/136,437, filed May 28,
1999; U.S. Provisional Number (Arena Pharmaceuticals, Inc. docket number
CHN10-1), filed September 29, 1999; U.S. Provisional Number 60/156,333, filed
September 29, 1999; U.S. Provisional Number 60/156,555, filed September 29, 1999;
U.S. Provisional Number 60/156,634, filed September 29, 1999; U.S. Provisional
Number (Arena Pharmaceuticals, Inc. docket number RUP6-1), filed October 1,
1999; U.S. Provisional Number (Arena Pharmaceuticals, Inc. docket number
RUP7-1), filed October 1, 1999; U.S. Provisional Number (Arena
Pharmaceuticals, Inc. docket number CHN6-1), filed October 1, 1999; U.S. Provisional
Number (Arena Pharmaceuticals, Inc. docket number RUP5-1), filed October 1,
1999; U.S. Provisional Number (Arena Pharmaceuticals, Inc. docket number
CHN9-1), filed October 1, 1999. This patent document is related to U.S. Serial Number
09/170,496, filed October 13, 1998, and U.S. Serial Number unknown (Woodcock
Washburn Kurtz Mackiewicz & Norris, LLP docket number AREN-0054), filed on
October 12, 1999 (via U.S. Express Mail), both being incorporated herein by reference.
This patent document also is related to U.S. Serial No. 09/364,425, filed July 30, 1999,
which is incorporated by reference in its entirety. This application also claims priority
to U.S. Serial Number (Woodcock, Washburn, Kurtz, Mackiewicz & Norris, LLP
docket number AREN-0050), filed on October 12, 1999 (via U.S. Express Mail),
incorporated by reference herein in its entirety. Each of the foregoing applications is
incorporated herein by reference in its entirety.

FIELD OF THE INVENTION
The invention disclosed in this patent document relates to transmembrane receptors,
and more particularly to endogenous, orphan, human G protein-coupled receptors
("GPCRs").

BACKGROUND OF THE INVENTION 

Although a number of receptor classes exist in humans, by far the most abundant and
therapeutically relevant is represented by the G protein-coupled receptor (GPCR or GPCRs)
class. It is estimated that there are some 100,000 genes within the human genome, and of
these, approximately 2%, or 2,000 genes, are estimated to code for GPCRs. Receptors,
including GPCRs, for which the endogenous ligand has been identified are referred to as
"known" receptors, while receptors for which the endogenous ligand has not been identified
are referred to as "orphan" receptors. GPCRs represent an important area for the
development of pharmaceutical products: from approximately 20 of the 100 known GPCRs,
60% of all prescription pharmaceuticals have been developed. This distinction is not merely
semantic, particularly in the case of GPCRs. Thus, the orphan GPCRs are to the
pharmaceutical industry what gold was to California in the late 19th century - an opportunity
to drive growth, expansion, enhancement and development.

GPCRs share a common structural motif. All these receptors have seven sequences
of between 22 and 24 hydrophobic amino acids that form seven alpha helices, each of which
spans the membrane (each span is identified by number, i.e., transmembrane-1 (TM-1),
transmembrane-2 (TM-2), etc.). The transmembrane helices are joined by strands of amino
acids between transmembrane-2 and transmembrane-3, transmembrane-4 and
transmembrane-5, and transmembrane-6 and transmembrane-7 on the exterior, or
"extracellular" side, of the cell membrane (these are referred to as "extracellular" regions 1,
2 and 3 (EC-1, EC-2 and EC-3), respectively). The transmembrane helices are also joined
by strands of amino acids between transmembrane-1 and transmembrane-2, transmembrane-3
and transmembrane-4, and transmembrane-5 and transmembrane-6 on the interior, or
"intracellular" side, of the cell membrane (these are referred to as "intracellular" regions 1,
2 and 3 (IC-1, IC-2 and IC-3), respectively). The "carboxy" ("C") terminus of the receptor
lies in the intracellular space within the cell, and the "amino" ("N") terminus of the receptor
lies in the extracellular space outside of the cell.
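
The alternating loop arrangement described above can be captured in a few lines. The following is a minimal illustrative sketch and not part of the disclosed subject matter; the function names `loop_between` and `topology` are hypothetical, and the only rule assumed is the one stated above, namely that the loop following an odd-numbered helix is intracellular and the loop following an even-numbered helix is extracellular.

```python
def loop_between(tm_n: int) -> str:
    """Return the name of the loop joining TM-n to TM-(n+1), for n = 1..6."""
    if not 1 <= tm_n <= 6:
        raise ValueError("tm_n must be between 1 and 6")
    # Loops after TM-1, TM-3, TM-5 are intracellular; after TM-2, TM-4, TM-6 extracellular.
    side = "IC" if tm_n % 2 == 1 else "EC"
    return f"{side}-{(tm_n + 1) // 2}"


def topology() -> list:
    """List the receptor segments in order from the N terminus to the C terminus."""
    segments = ["N terminus (extracellular)"]
    for n in range(1, 8):
        segments.append(f"TM-{n}")
        if n < 7:
            segments.append(loop_between(n))
    segments.append("C terminus (intracellular)")
    return segments


if __name__ == "__main__":
    print(" | ".join(topology()))
```

For example, `loop_between(5)` names the loop between TM-5 and TM-6, which is the IC-3 region.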

Generally, when an endogenous ligand binds with the receptor (often referred to as
"activation" of the receptor), there is a change in the conformation of the intracellular region
that allows for coupling between the intracellular region and an intracellular "G-protein." It
has been reported that GPCRs are "promiscuous" with respect to G proteins, i.e., that a
GPCR can interact with more than one G protein. See, Kenakin, T., 43 Life Sciences 1095
(1988). Although other G proteins exist, currently, Gq, Gs, Gi, and Go are G proteins that
have been identified. Endogenous ligand-activated GPCR coupling with the G-protein
begins a signaling cascade process (referred to as "signal transduction"). Under normal
conditions, signal transduction ultimately results in cellular activation or cellular inhibition.
It is thought that the IC-3 loop as well as the carboxy terminus of the receptor interact with
the G protein.

Under physiological conditions, GPCRs exist in the cell membrane in equilibrium
between two different conformations: an "inactive" state and an "active" state. A receptor
in an inactive state is unable to link to the intracellular signaling transduction pathway to
produce a biological response. Changing the receptor conformation to the active state allows
linkage to the transduction pathway (via the G-protein) and produces a biological response.
A receptor may be stabilized in an active state by an endogenous ligand or a compound such
as a drug.

SUMMARY OF THE INVENTION

Disclosed herein are human endogenous orphan G protein-coupled receptors.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figures 1A and 1B provide reference "grids" for certain dot-blots provided herein
(see also, Figures 2A and 2B, respectively).
Figures 2A and 2B provide reproductions of the results of certain dot-blot analyses
resulting from hCHN3 and hCHN8, respectively (see also, Figures 1A and 1B, respectively).
Figure 3 provides a reproduction of the results of RT-PCR analysis of hRUP3.
Figure 4 provides a reproduction of the results of RT-PCR analysis of hRUP4.
Figure 5 provides a reproduction of the results of RT-PCR analysis of hRUP6.

DETAILED DESCRIPTION 
The scientific literature that has evolved around receptors has adopted a number of 
terms to refer to ligands having various effects on receptors. For clarity and consistency, the
following definitions will be used throughout this patent document. To the extent that these 
definitions conflict with other definitions for these terms, the following definitions shall 
control: 

AMINO ACID ABBREVIATIONS used herein are set out in Table 1:

                TABLE 1

ALANINE            ALA    A
ARGININE           ARG    R
ASPARAGINE         ASN    N
ASPARTIC ACID      ASP    D
CYSTEINE           CYS    C
GLUTAMIC ACID      GLU    E
GLUTAMINE          GLN    Q
GLYCINE            GLY    G
HISTIDINE          HIS    H
ISOLEUCINE         ILE    I
LEUCINE            LEU    L
LYSINE             LYS    K
METHIONINE         MET    M
PHENYLALANINE      PHE    F
PROLINE            PRO    P
SERINE             SER    S
THREONINE          THR    T
TRYPTOPHAN         TRP    W
TYROSINE           TYR    Y
VALINE             VAL    V



COMPOSITION means a material comprising at least one component. 

ENDOGENOUS shall mean a material that a mammal naturally produces.
ENDOGENOUS in reference to, for example and not limitation, the term "receptor," shall
mean that which is naturally produced by a mammal (for example, and not limitation, a
human) or a virus. By contrast, the term NON-ENDOGENOUS in this context shall mean
that which is not naturally produced by a mammal (for example, and not limitation, a human)
or a virus.

HOST CELL shall mean a cell capable of having a Plasmid and/or Vector
incorporated therein. In the case of a prokaryotic Host Cell, a Plasmid is typically replicated
as an autonomous molecule as the Host Cell replicates (generally, the Plasmid is thereafter
isolated for introduction into a eukaryotic Host Cell); in the case of a eukaryotic Host Cell,
a Plasmid is integrated into the cellular DNA of the Host Cell such that when the eukaryotic
Host Cell replicates, the Plasmid replicates. Preferably, for the purposes of the invention
disclosed herein, the Host Cell is eukaryotic, more preferably, mammalian, and most
preferably selected from the group consisting of 293, 293T and COS-7 cells.

LIGAND shall mean an endogenous, naturally occurring molecule specific for an 
endogenous, naturally occurring receptor. 

NON-ORPHAN RECEPTOR shall mean an endogenous naturally occurring
molecule specific for an endogenous naturally occurring ligand wherein the binding of a
ligand to a receptor activates an intracellular signaling pathway.

ORPHAN RECEPTOR shall mean an endogenous receptor for which the 
endogenous ligand specific for that receptor has not been identified or is not known. 

PLASMID shall mean the combination of a Vector and cDNA. Generally, a Plasmid
is introduced into a Host Cell for the purposes of replication and/or expression of the cDNA
as a protein.

VECTOR in reference to cDNA shall mean a circular DNA capable of incorporating
at least one cDNA and capable of incorporation into a Host Cell.

The order of the following sections is set forth for presentational efficiency and is
not intended, nor should be construed, as a limitation on the disclosure or the claims to
follow.

A. Identification of Human GPCRs

The efforts of the Human Genome project have led to the identification of a plethora
of information regarding nucleic acid sequences located within the human genome; it has
been the case in this endeavor that genetic sequence information has been made available
without an understanding or recognition as to whether or not any particular genomic
sequence does or may contain open-reading frame information that translates into human
proteins. Several methods of identifying nucleic acid sequences within the human genome
are within the purview of those having ordinary skill in the art. For example, and not
limitation, a variety of GPCRs, disclosed herein, were discovered by reviewing the
GenBank™ database, while other GPCRs were discovered by utilizing a nucleic acid
sequence of a GPCR, previously sequenced, to conduct a BLAST™ search of the EST
database. Table A, below, lists the disclosed endogenous orphan GPCRs along with each
GPCR's respective homologous GPCR:

                                   TABLE A

Disclosed    Accession      Open Reading    Per Cent Homology           Reference To
Human        Number         Frame           To Designated GPCR          Homologous GPCR
Orphan       Identified     (Base Pairs)                                (Accession No.)
GPCRs

hARE-3       AL033379       1,260 bp        52.3% LPA-R                 U92642
hARE-4       AC006087       1,119 bp        36% P2Y5                    AF000546
hARE-5       AC006255       1,104 bp        32% Oryzias latipes         D43633
hGPR27       AA775870       1,128 bp
hARE-1       AI090920       999 bp          43% KIAA0001                D13626
hARE-2       AA359504       1,122 bp        53% GPR27
hPPR1        H67224         1,053 bp        39% EBI1                    L31581
hG2A         AA754702       1,113 bp        31% GPR4                    L36148
hRUP3        AL035423       1,005 bp        30% Drosophila              2133653
                                            melanogaster
hRUP4        AI307658       1,296 bp        32% pNPGPR;                 NP_004876;
                                            28% and 29% Zebra fish      AAC41276 and
                                            Ya and Yb, respectively     AAB94616
hRUP5        AC005849       1,413 bp        25% DEZ;                    Q99788;
                                            23% FMLPR                   P21462
hRUP6        AC005871       1,245 bp        48% GPR66                   NP_006047
hRUP7        AC007922       1,173 bp        43% H3R                     AF140538
hCHN3        EST 36581      1,113 bp        53% GPR27
hCHN4        AA804531       1,077 bp        32% thrombin                4503637
hCHN6        EST 2134670    1,503 bp        36% edg-1                   NP_001391
hCHN8        EST 764455     1,029 bp        47% KIAA0001                D13626
hCHN9        EST 1541536    1,077 bp        41% LTB4R                   NM_000752
hCHN10       EST 1365839    1,055 bp        35% P2Y                     NM_002563

Receptor homology is useful in terms of gaining an appreciation of a role of the
disclosed receptors within the human body. Additionally, such homology can provide insight
as to possible endogenous ligand(s) that may be natural activators for the disclosed orphan
GPCRs.

B. Receptor Screening 

Techniques have become more readily available over the past few years for
endogenous-ligand identification (this, primarily, for the purpose of providing a means of
conducting receptor-binding assays that require a receptor's endogenous ligand) because the
traditional study of receptors has always proceeded from the a priori assumption (historically
based) that the endogenous ligand must first be identified before discovery could proceed to
find antagonists and other molecules that could affect the receptor. Even in cases where an
antagonist might have been known first, the search immediately extended to looking for the
endogenous ligand. This mode of thinking has persisted in receptor research even after the
discovery of constitutively activated receptors. What has not been heretofore recognized is
that it is the active state of the receptor that is most useful for discovering agonists, partial
agonists, and inverse agonists of the receptor. For those diseases which result from an overly
active receptor or an under-active receptor, what is desired in a therapeutic drug is a
compound which acts to diminish the active state of a receptor or enhance the activity of the
receptor, respectively, not necessarily a drug which is an antagonist to the endogenous ligand.
This is because a compound that reduces or enhances the activity of the active receptor state
need not bind at the same site as the endogenous ligand. Thus, as taught by a method of this
invention, any search for therapeutic compounds should start by screening compounds
against the ligand-independent active state.

As is known in the art, GPCRs can be "active" in their endogenous state even without
the binding of the receptor's endogenous ligand thereto. Such naturally-active receptors can
be screened for the direct identification (i.e., without the need for the receptor's endogenous
ligand) of, in particular, inverse agonists. Alternatively, the receptor can be "activated" via,
e.g., mutation of the receptor to establish a non-endogenous version of the receptor that is
active in the absence of the receptor's endogenous ligand.

Screening candidate compounds against an endogenous or non-endogenous,
constitutively activated version of the human orphan GPCRs disclosed herein can provide
for the direct identification of candidate compounds which act at this cell surface receptor,
without requiring use of the receptor's endogenous ligand. By determining areas within
the body where the endogenous version of human GPCRs disclosed herein is expressed
and/or over-expressed, it is possible to determine related disease/disorder states which are
associated with the expression and/or over-expression of the receptor; such an approach is
disclosed in this patent document.

Creation of a mutation that may evidence constitutive activation of the
human orphan GPCRs disclosed herein is based upon the distance from the proline residue
that is presumed to be located within TM6 of the GPCR, typically near the TM6/IC3
interface (such proline residue appears to be quite conserved). By mutating the amino acid
residue located 16 amino acid residues from this residue (presumably located in the IC3
region of the receptor) to, most preferably, a lysine residue, such activation may be obtained.
Other amino acid residues may be useful in the mutation at this position to achieve this
objective.
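
The positional rule in the preceding paragraph can be illustrated with a short sketch. This is a hypothetical helper and not the procedure of the disclosure: the function name, the toy sequence, and the choice to count 16 residues toward the N terminus (on the assumption that the IC3 region precedes TM6 in the primary sequence) are all assumptions made for illustration.

```python
from typing import Tuple


def mutate_relative_to_proline(seq: str, proline_pos: int,
                               offset: int = 16,
                               new_residue: str = "K") -> Tuple[str, str]:
    """Replace the residue `offset` positions N-terminal of a given proline.

    seq          one-letter amino acid sequence
    proline_pos  1-based position of the (presumed TM6) proline
    Returns the mutated sequence and a label such as 'A5K'.
    """
    if seq[proline_pos - 1] != "P":
        raise ValueError("residue at proline_pos is not a proline")
    target = proline_pos - offset  # assumption: count toward the N terminus (IC3 side)
    if target < 1:
        raise ValueError("offset runs past the start of the sequence")
    old = seq[target - 1]
    mutated = seq[:target - 1] + new_residue + seq[target:]
    return mutated, f"{old}{target}{new_residue}"


if __name__ == "__main__":
    toy = "A" * 20 + "P" + "A" * 5   # hypothetical sequence, proline at position 21
    mutated, label = mutate_relative_to_proline(toy, 21)
    print(label, mutated)
```

The default `new_residue="K"` reflects the stated preference for lysine; another one-letter code can be passed to try a different substitution at the same position.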

C. Disease/Disorder Identification and/or Selection

Preferably, the DNA sequence of the human orphan GPCR can be used to make a
probe for (a) dot-blot analysis against tissue-mRNA, and/or (b) RT-PCR identification of
the expression of the receptor in tissue samples. The presence of a receptor in a tissue
source, or a diseased tissue, or the presence of the receptor at elevated concentrations in
diseased tissue compared to a normal tissue, can be preferably utilized to identify a
correlation with a treatment regimen, including but not limited to, a disease associated
with that disease. Receptors can equally well be localized to regions of organs by this
technique. Based on the known functions of the specific tissues to which the receptor is
localized, the putative functional role of the receptor can be deduced.

D. Screening of Candidate Compounds
1. Generic GPCR screening assay techniques

When a G protein receptor becomes constitutively active (i.e., active in the absence
of endogenous ligand binding thereto), it binds to a G protein (e.g., Gq, Gs, Gi, Go) and
stimulates the binding of GTP to the G protein. The G protein then acts as a GTPase and
slowly hydrolyzes the GTP to GDP, whereby the receptor, under normal conditions, becomes
deactivated. However, constitutively activated receptors continue to exchange GDP to GTP.
A non-hydrolyzable analog of GTP, [35S]GTPγS, can be used to monitor enhanced binding
to membranes which express constitutively activated receptors. It is reported that
[35S]GTPγS can be used to monitor G protein coupling to membranes in the absence and
presence of ligand. An example of this monitoring, among other examples well-known and
available to those in the art, was reported by Traynor and Nahorski in 1995. The preferred
use of this assay system is for initial screening of candidate compounds because the system
is generically applicable to all G protein-coupled receptors regardless of the particular G
protein that interacts with the intracellular domain of the receptor.
2. Specific GPCR screening assay techniques

Once candidate compounds are identified using the "generic" G protein-coupled
receptor assay (i.e., an assay to select compounds that are agonists, partial agonists, or inverse
agonists), further screening to confirm that the compounds have interacted at the receptor site
is preferred. For example, a compound identified by the "generic" assay may not bind to the
receptor, but may instead merely "uncouple" the G protein from the intracellular domain.

Gs and Gi.

Gs stimulates the enzyme adenylyl cyclase. Gi (and Go), on the other hand, inhibit
this enzyme. Adenylyl cyclase catalyzes the conversion of ATP to cAMP; thus,
constitutively activated GPCRs that couple the Gs protein are associated with increased
cellular levels of cAMP. On the other hand, constitutively activated GPCRs that couple the
Gi (or Go) protein are associated with decreased cellular levels of cAMP. See, generally,
"Indirect Mechanisms of Synaptic Transmission," Chpt. 8, From Neuron To Brain (3rd Ed.)
Nichols, J.G. et al eds. Sinauer Associates, Inc. (1992). Thus, assays that detect cAMP can
be utilized to determine if a candidate compound is, e.g., an inverse agonist to the receptor
(i.e., such a compound would decrease the levels of cAMP). A variety of approaches known
in the art for measuring cAMP can be utilized; a most preferred approach relies upon the use
of anti-cAMP antibodies in an ELISA-based format. Another type of assay that can be
utilized is a whole cell second messenger reporter system assay. Promoters on genes drive
the expression of the proteins that a particular gene encodes. Cyclic AMP drives gene
expression by promoting the binding of a cAMP-responsive DNA binding protein or
transcription factor (CREB) which then binds to the promoter at specific sites called cAMP
response elements and drives the expression of the gene. Reporter systems can be constructed
which have a promoter containing multiple cAMP response elements before the reporter
gene, e.g., β-galactosidase or luciferase. Thus, a constitutively activated Gs-linked receptor
causes the accumulation of cAMP that then activates the gene and expression of the reporter
protein. The reporter protein such as β-galactosidase or luciferase can then be detected using
standard biochemical assays (Chen et al. 1995).

Go and Gq.

Gq and Go are associated with activation of the enzyme phospholipase C, which in
turn hydrolyzes the phospholipid PIP2, releasing two intracellular messengers:
diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). Increased accumulation of IP3
is associated with activation of Gq- and Go-associated receptors. See, generally, "Indirect
Mechanisms of Synaptic Transmission," Chpt. 8, From Neuron To Brain (3rd Ed.) Nichols,
J.G. et al eds. Sinauer Associates, Inc. (1992). Assays that detect IP3 accumulation can be
utilized to determine if a candidate compound is, e.g., an inverse agonist to a Gq- or Go-
associated receptor (i.e., such a compound would decrease the levels of IP3). Gq-associated
receptors can also be examined using an AP1 reporter assay in that Gq-dependent
phospholipase C causes activation of genes containing AP1 elements; thus, activated Gq-
associated receptors will evidence an increase in the expression of such genes, whereby
inverse agonists thereto will evidence a decrease in such expression, and agonists will
evidence an increase in such expression. Assays for such detection are commercially
available.
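
The second-messenger relationships described in the two preceding subsections can be summarized as a small lookup table. The sketch below is illustrative only; the class and function names are hypothetical. The text states the expected inverse-agonist effect only for receptors whose constitutive activity raises the messenger level (cAMP for Gs, IP3 for Gq); the "increase" returned for a Gi-coupled receptor is an inference from the stated decrease in cAMP, not a statement from the text. Go is omitted from the table because the text associates it with both pathways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Readout:
    effector: str             # enzyme acted on by the G protein
    messenger: str            # second messenger measured in the assay
    constitutive_effect: str  # "increase" or "decrease" in messenger level


READOUTS = {
    "Gs": Readout("adenylyl cyclase (stimulated)", "cAMP", "increase"),
    "Gi": Readout("adenylyl cyclase (inhibited)", "cAMP", "decrease"),
    "Gq": Readout("phospholipase C (activated)", "IP3", "increase"),
}


def expected_inverse_agonist_shift(g_protein: str) -> str:
    """Direction in which an inverse agonist is expected to move the messenger.

    An inverse agonist opposes constitutive activity, so the shift is taken
    to be the opposite of the constitutive effect (an inference for Gi).
    """
    readout = READOUTS[g_protein]
    return "decrease" if readout.constitutive_effect == "increase" else "increase"


if __name__ == "__main__":
    for name, readout in READOUTS.items():
        print(name, readout.messenger, expected_inverse_agonist_shift(name))
```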

3. GPCR Fusion Protein 

The use of an endogenous, constitutively activated orphan GPCR, or a non-endogenous,
constitutively activated orphan GPCR, for screening of candidate compounds
for the direct identification of inverse agonists, agonists and partial agonists provides a
unique challenge in that, by definition, the receptor is active even in the absence of an
endogenous ligand bound thereto. Thus, it is often useful that an approach be utilized that
can enhance the signal obtained by the activated receptor. A preferred approach is the use
of a GPCR Fusion Protein.



Generally, once it is determined that a GPCR is or has been constitutively activated,
using the assay techniques set forth above (as well as others), it is possible to determine the
predominant G protein that couples with the GPCR. Coupling of the G protein
to the GPCR provides a signaling pathway that can be assessed. Because it is most preferred
that screening take place by use of a mammalian expression system, such a system will be
expected to have endogenous G protein therein. Thus, by definition, in such a system, the
constitutively activated orphan GPCR will continuously signal. In this regard, it is preferred
that this signal be enhanced such that in the presence of, e.g., an inverse agonist to the
receptor, it is more likely that it will be able to more readily differentiate, particularly in the
context of screening, between the receptor when it is contacted with the inverse agonist.

The GPCR Fusion Protein is intended to enhance the efficacy of G protein coupling
with the GPCR. The GPCR Fusion Protein is preferred for screening with a non-endogenous,
constitutively activated GPCR because such an approach increases the signal
that is most preferably utilized in such screening techniques, although the GPCR Fusion
Protein can also be (and preferably is) used with an endogenous, constitutively activated
GPCR. This is important in facilitating a significant "signal to noise" ratio; such a
significant ratio is preferred for the screening of candidate compounds as disclosed herein.

The construction of a construct useful for expression of a GPCR Fusion Protein is
within the purview of those having ordinary skill in the art. Commercially available
expression vectors and systems offer a variety of approaches that can fit the particular needs
of an investigator. The criteria of importance for such a GPCR Fusion Protein construct are
that the GPCR sequence and the G protein sequence both be in-frame (preferably, the
sequence for the GPCR is upstream of the G protein sequence) and that the "stop" codon of
the GPCR must be deleted or replaced such that upon expression of the GPCR, the G protein
can also be expressed. The GPCR can be linked directly to the G protein, or there can be
spacer residues between the two (preferably, no more than about 12, although this number
can be readily ascertained by one of ordinary skill in the art). We have a preference (based
upon convenience) of use of a spacer in that some restriction sites that are not used will,
effectively, upon expression, become a spacer. Most preferably, the G protein that couples
to the GPCR will have been identified prior to the creation of the GPCR Fusion Protein
construct. Because there are only a few G proteins that have been identified, it is preferred
that a construct comprising the sequence of the G protein (i.e., a universal G protein
construct) be available for insertion of an endogenous GPCR sequence therein; this provides
for efficiency in the context of large-scale screening of a variety of different endogenous
GPCRs having different sequences.
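
The construct criteria above (GPCR upstream, stop codon removed, both sequences in-frame, an optional short spacer) can be checked mechanically. The following is a minimal sketch under stated assumptions and not the construction method of the disclosure: the function name `build_fusion_orf` is hypothetical, the sequences in the usage example are toy strings rather than disclosed sequences, and "no more than about 12" spacer residues is treated as a hard limit of 12 purely for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_SPACER_RESIDUES = 12  # "about 12" in the text; a hard cap here for illustration


def build_fusion_orf(gpcr_orf: str, g_protein_orf: str, spacer: str = "") -> str:
    """Join a GPCR ORF (upstream) to a G protein ORF (downstream), in-frame."""
    gpcr_orf, g_protein_orf, spacer = (
        gpcr_orf.upper(), g_protein_orf.upper(), spacer.upper())

    for name, dna in (("GPCR", gpcr_orf), ("G protein", g_protein_orf),
                      ("spacer", spacer)):
        if len(dna) % 3 != 0:
            raise ValueError(f"{name} sequence length is not a multiple of 3")

    # Delete the GPCR stop codon so that translation continues into the G protein.
    if gpcr_orf[-3:] in STOP_CODONS:
        gpcr_orf = gpcr_orf[:-3]

    if len(spacer) // 3 > MAX_SPACER_RESIDUES:
        raise ValueError("spacer encodes more residues than the chosen limit")

    # No stop codon may remain in-frame ahead of the G protein sequence.
    upstream = gpcr_orf + spacer
    for i in range(0, len(upstream), 3):
        if upstream[i:i + 3] in STOP_CODONS:
            raise ValueError(f"in-frame stop codon at nucleotide {i + 1}")

    return upstream + g_protein_orf


if __name__ == "__main__":
    # Toy inputs: 'GGATCC' stands in for an unused restriction site acting as spacer.
    print(build_fusion_orf("ATGGCTTGA", "ATGGGCTAA", spacer="GGATCC"))
```

Keeping the G protein sequence as a fixed argument mirrors the "universal G protein construct" idea: only `gpcr_orf` changes from one receptor to the next.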
E. Other Utility 

Although a preferred use of the human orphan GPCRs disclosed herein may be for
the direct identification of candidate compounds as inverse agonists, agonists or partial
agonists (preferably for use as pharmaceutical agents), these versions of human GPCRs can
also be utilized in research settings. For example, in vitro and in vivo systems incorporating
GPCRs can be utilized to further elucidate and understand the roles these receptors play in
the human condition, both normal and diseased, as well as understanding the role of
constitutive activation as it applies to understanding the signaling cascade. The value in
human orphan GPCRs is that its utility as a research tool is enhanced in that by determining
the location(s) of such receptors within the body, the GPCRs can be used to understand the
role of these receptors in the human body before the endogenous ligand therefor is identified.
Other uses of the disclosed receptors will become apparent to those in the art based upon,
inter alia, a review of this patent document.

EXAMPLES

The following examples are presented for purposes of elucidation, and not limitation,
of the present invention. While specific nucleic acid and amino acid sequences are disclosed
herein, those of ordinary skill in the art are credited with the ability to make minor
modifications to these sequences while achieving the same or substantially similar results
reported below. Unless otherwise indicated below, all nucleic acid sequences for the
disclosed endogenous orphan human GPCRs have been sequenced and verified. For
purposes of equivalent receptors, those of ordinary skill in the art will readily appreciate that
conservative substitutions can be made to the disclosed sequences to obtain a functionally
equivalent receptor.

Example 1 

Endogenous Human GPCRs
1. Identification of Human GPCRs

Several of the disclosed endogenous human GPCRs were identified based upon a
review of the GenBank database information. While searching the database, the following
cDNA clones were identified as evidenced below.

Disclosed    Accession    Complete DNA    Open Reading    Nucleic Acid    Amino Acid
Human        Number       Sequence        Frame           SEQ.ID.NO.      SEQ.ID.NO.
Orphan                    (Base Pairs)    (Base Pairs)
GPCRs

hARE-3       AL033379     111,389 bp      1,260 bp        1               2
hARE-4       AC006087     226,925 bp      1,119 bp        3               4
hARE-5       AC006255     127,605 bp      1,104 bp        5               6
hRUP3        AL035423     140,094 bp      1,005 bp        7               8
hRUP5        AC005849     169,144 bp      1,413 bp        9               10
hRUP6        AC005871     218,807 bp      1,245 bp        11              12
hRUP7        AC007922     158,858 bp      1,173 bp        13              14

Other disclosed endogenous human GPCRs were identified by conducting a BLAST
search of the EST database (dbest) using the following EST clones as query sequences. The
following EST clones identified were then used as a probe to screen a human genomic
library.



Disclosed    Query              EST Clone/                 Open Reading    Nucleic Acid    Amino Acid
Human        (Sequence)         Accession No.              Frame           SEQ.ID.NO.      SEQ.ID.NO.
Orphan                          Identified                 (Base Pairs)
GPCRs

hGPCR27      Mouse GPCR27       AA775870                   1,125 bp        15              16
hARE-1       TDAG               1689643 AI090920           999 bp          17              18
hARE-2       GPCR27             68530 AA359504             1,122 bp        19              20
hPPR1        Bovine PPR1        238667 H67224              1,053 bp        21              22
hG2A         Mouse 1179426      See Example 2(a), below    1,113 bp        23              24
hCHN3        N.A.               EST 36581 (full length)    1,113 bp        25              26
hCHN4        TDAG               1184934 AA804531           1,077 bp        27              28
hCHN6        N.A.               EST 2134670 (full length)  1,503 bp        29              30
hCHN8        KIAA0001           EST 764455                 1,029 bp        31              32
hCHN9        1365839            EST 1541536                1,077 bp        33              34
hCHN10       Mouse EST 1365839  Human 1365839              1,005 bp        35              36
hRUP4        N.A.               AI307658                   1,296 bp        37              38

N.A. = "not applicable".


2. Full Length Cloning
a. hG2A (Seq. Id. Nos. 23 & 24)

Mouse EST clone 1179426 was used to obtain a human genomic clone containing all
but three amino acids of the hG2A coding sequence. The 5' end of this coding sequence was
obtained by using 5' RACE™, and the template for PCR was Clontech's Human Spleen
Marathon-ready™ cDNA. The disclosed human G2A was amplified by PCR using the G2A
cDNA specific primers for the first and second round PCR as shown in SEQ.ID.NO.: 39 and
SEQ.ID.NO.: 40 as follows:
5'-CTGTGTACAGCAGTTCGCAGAGTG-3' (SEQ.ID.NO.: 39; first round PCR)
5'-GAGTGCGAGGCAGAGCAGGTAGAC-3' (SEQ.ID.NO.: 40; second round PCR).
PCR was performed using Advantage™ GC Polymerase Kit (Clontech; manufacturer
instructions will be followed), at 94°C for 30 sec followed by 5 cycles of 94°C for 5 sec and
72°C for 4 min; and 30 cycles of 94°C for 5 sec and 70°C for 4 min. An approximate 1.3 kb
PCR fragment was purified from agarose gel, digested with Hind III and Xba I and cloned
into the expression vector pRC/CMV2 (Invitrogen). The cloned insert was sequenced using
the T7 Sequenase™ kit (USB Amersham; manufacturer instructions will be followed) and
the sequence was compared with the presented sequence. Expression of the human G2A will
be detected by probing an RNA dot blot (Clontech; manufacturer instructions will be
followed) with the P32-labeled fragment.

b. hCHN9 (Seq. Id. Nos. 33 & 34)
Sequencing of the EST clone 1541536 indicated that hCHN9 is a partial cDNA
clone having only an initiation codon; i.e., the termination codon was missing. When
hCHN9 was used to "blast" against the data base (nr), the 3' sequence of hCHN9 was
100% homologous to the 5' untranslated region of the leukotriene B4 receptor (LTB4R) cDNA,
which contained a termination codon in frame with the hCHN9 coding sequence. To
determine whether the 5' untranslated region of LTB4R cDNA was the 3' sequence of
hCHN9, PCR was performed using primers based upon the 5' sequence flanking the
initiation codon found in hCHN9 and the 3' sequence around the termination codon found
in the LTB4R 5' untranslated region. The primer sequences utilized were as follows:
5'-CCCGAATTCCTGCTTGCTCCCAGCTTGGCGC-3' (SEQ.ID.NO.: 41; sense) and
5'-TGTGGATCCTGGTGTCAAAGGTCCCATTCCGG-3' (SEQ.ID.NO.: 42; antisense).
PCR was performed using thymus cDNA as a template and rTth polymerase (Perkin Elmer)
with the buffer system provided by the manufacturer, 0.25 uM of each primer, and 0.2 mM
of each of the 4 nucleotides. The cycle condition was 30 cycles of 94°C for 1 min, 65°C for 1 min
and 72°C for 1 min and 10 sec. A 1.1 kb fragment consistent with the predicted size was
obtained from PCR. This PCR fragment was subcloned into pCMV (see below) and
sequenced (see, SEQ.ID.NO.: 33).
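The distinction drawn above between a partial clone (initiation codon present, termination codon missing) and a complete coding sequence can be illustrated with a short sketch. The function `first_orf` and the toy sequences are hypothetical and are not taken from the disclosed sequences; the sketch simply reads codons in frame from the first ATG until a stop codon is met.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(seq: str):
    """Locate the first ATG and scan in frame for a termination codon.

    Returns None if no ATG is present, (start, end) with end exclusive if an
    in-frame stop codon is found, or (start, None) if the sequence ends
    before any in-frame stop codon (i.e. a partial coding sequence).
    """
    seq = seq.upper()
    start = seq.find("ATG")
    if start < 0:
        return None
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return start, None


# Toy examples (hypothetical sequences):
print(first_orf("CCATGGCTTGTTAAGG"))  # complete: ATG GCT TGT TAA
print(first_orf("CCATGGCTTGT"))       # partial: no in-frame stop codon
```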

c. hRUP4 (Seq. Id. Nos. 37 & 38)
The full length hRUP4 was cloned by RT-PCR with human brain cDNA (Clontech)
as template:
5'-TCACAATGCTAGGTGTGGTC-3' (SEQ.ID.NO.: 43; sense) and
5'-TGCATAGACAATGGGATTACAG-3' (SEQ.ID.NO.: 44; antisense).

PCR was performed using TaqPlus™ Precision™ polymerase (Stratagene; manufacturer's
instructions will be followed) by the following cycles: 94°C for 2 min; 94°C for 30 sec; 55°C
for 30 sec; 72°C for 45 sec; and 72°C for 10 min. Cycles 2 through 4 were repeated 30
times.
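As a rough planning aid, the programmed hold time of a cycling protocol such as the one just described can be totalled as in the sketch below. The helper `program_seconds` is hypothetical, and the figure it returns ignores ramp times between temperatures, so an actual run on a thermal cycler will take longer.

```python
def program_seconds(initial_s: int, cycle_steps_s: list, n_cycles: int, final_s: int) -> int:
    """Total programmed hold time: one initial step, n repeated cycles, one final step."""
    return initial_s + n_cycles * sum(cycle_steps_s) + final_s


# hRUP4 cloning program: 94°C 2 min; 30 x (94°C 30 sec, 55°C 30 sec, 72°C 45 sec); 72°C 10 min
total = program_seconds(120, [30, 30, 45], 30, 600)
print(total, "seconds, i.e.", total / 60, "minutes")
```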

The PCR products were separated on a 1% agarose gel and a 500 bp PCR fragment
was isolated and cloned into the pCRII-TOPO vector (Invitrogen) and sequenced using the
T7 DNA Sequenase™ kit (Amersham) and the SP6/T7 primers (Stratagene). Sequence
analysis revealed that the PCR fragment was indeed an alternatively spliced form of
AI307658 having a continuous open reading frame with similarity to other GPCRs. The
completed sequence of this PCR fragment was as follows:

5'-tcacaatgctaggtgtggtctggctggtggcagtgatcgtaggatcacccatgtggcac
gtgcaacaacttgagatcaaatatgacttcctatatgaaaaggaacacatctgctgcttagaa
gagtggaccagccctgtgcaccagaagatctacaccaccttcatccttgtcatcctcttcctcc
tgcctcttatggtgatgcttattctgtacgtaaaattggttatgaactttggataaagaaaaga
gttggggatggttcagtgcttcgaactattcatggaaaagaaatgtccaaaataggcaggaag
aagaaacgagctgtcattatgatggtgacagtggtggctctctttgctgtgtgctgggcacca
ttccatgttgtccatatgatgattgaatacagtaattttgaaaaggaatatgatgatgtgaca
atcaagatgatttttgctatcgtgcaaattattggattttccaactccatctgtaatcccattg
tctatgca-3' (SEQ.ID.NO.: 45)

Based on the above sequence, two sense oligonucleotide primers:

5'-CTGCTTAGAAGAGTGGACCAG-3' (SEQ.ID.NO.: 46; oligo 1),
5'-CTGTGCACCAGAAGATCTACAC-3' (SEQ.ID.NO.: 47; oligo 2)

and two antisense oligonucleotide primers:

5'-CAAGGATGAAGGTGGTGTAGA-3' (SEQ.ID.NO.: 48; oligo 3)
5'-GTGTAGATCTTCTGGTGCACAGG-3' (SEQ.ID.NO.: 49; oligo 4)

were used for 3'- and 5'-RACE PCR with a human brain Marathon-Ready™ cDNA (Clontech,
Cat# 7400-1) as template, according to manufacturer's instructions. DNA fragments
generated by the RACE PCR were cloned into the pCRII-TOPO™ vector (Invitrogen) and
sequenced using the SP6/T7 primers (Stratagene) and some internal primers. The 3' RACE
product contained a poly(A) tail and a completed open reading frame ending at a TAA stop
codon. The 5' RACE product contained an incomplete 5' end; i.e., the ATG initiation codon
was not present.
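The orientation of oligos 1 through 4 relative to the PCR fragment of SEQ.ID.NO.: 45 can be checked with a small sketch: a sense primer should match the fragment directly, and an antisense primer should match it after reverse complementation. The helper names are hypothetical, and `FRAGMENT_EXCERPT` is only the stretch of SEQ.ID.NO.: 45 that spans the four primers, not the whole fragment.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_strand(primer: str, template: str):
    """Report which strand of the template the primer sequence matches, if any."""
    template = template.upper()
    primer = primer.upper()
    if primer in template:
        return "sense"
    if reverse_complement(primer) in template:
        return "antisense"
    return None


# Excerpt of SEQ.ID.NO.: 45 covering the region from which the oligos were designed.
FRAGMENT_EXCERPT = "ctgcttagaagagtggaccagccctgtgcaccagaagatctacaccaccttcatccttgtcatc"

OLIGOS = {
    "oligo 1": "CTGCTTAGAAGAGTGGACCAG",
    "oligo 2": "CTGTGCACCAGAAGATCTACAC",
    "oligo 3": "CAAGGATGAAGGTGGTGTAGA",
    "oligo 4": "GTGTAGATCTTCTGGTGCACAGG",
}

for name, seq in OLIGOS.items():
    print(name, primer_strand(seq, FRAGMENT_EXCERPT))
```

In this check the sense primers point toward the 3' end of the fragment (as needed for 3' RACE) and the antisense primers toward the 5' end (as needed for 5' RACE).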

Based on the new 5' sequence, oligo 3 and the following primer:
5'-GCAATGCAGGTCATAGTGAGC-3' (SEQ.ID.NO.: 50; oligo 5)

were used for the second round of 5' RACE PCR and the PCR products were analyzed as
above. A third round of 5' RACE PCR was carried out utilizing antisense primers:

5'-TGGAGCATGGTGACGGGAATGCAGAAG-3' (SEQ.ID.NO.: 51; oligo 6) and
5'-GTGATGAGCAGGTCACTGAGCGCCAAG-3' (SEQ.ID.NO.: 52; oligo 7).
The sequence of the 5' RACE PCR products revealed the presence of the initiation codon
ATG, and a further round of 5' RACE PCR did not generate any more 5' sequence. The
completed 5' sequence was confirmed by RT-PCR using sense primer
5'-GCAATGCAGGCGCTTAACATTAC-3' (SEQ.ID.NO.: 53; oligo 8)

and oligo 4 as primers and sequence analysis of the 650 bp PCR product generated from
human brain and heart cDNA templates (Clontech, Cat# 7404-1). The completed 3'
sequence was confirmed by RT-PCR using oligo 2 and the following antisense primer:
5'-TTGGGTTACAATCTGAAGGGCA-3' (SEQ.ID.NO.: 54; oligo 9)
and sequence analysis of the 670 bp PCR product generated from human brain and heart
cDNA templates (Clontech, Cat# 7404-1).
d. hRUP5 (Seq. Id. Nos. 9 & 10)
The full length hRUP5 was cloned by RT-PCR using a sense primer upstream from
ATG, the initiation codon (SEQ.ID.NO.: 55), and an antisense primer containing TCA as the
stop codon (SEQ.ID.NO.: 56), which had the following sequences:

5'-ACTCCGTGTCCAGCAGGACTCTG-3' (SEQ.ID.NO.: 55)
5'-TGCGTGTTCCTGGACCCTCACGTG-3' (SEQ.ID.NO.: 56)

and human peripheral leukocyte cDNA (Clontech) as a template. Advantage cDNA
polymerase (Clontech) was used for the amplification in a 50 ul reaction by the following
cycle with step 2 through step 4 repeated 30 times: 94°C for 30 sec; 94°C for 15 sec; 69°C for
40 sec; 72°C for 3 min; and 72°C for 6 min. A 1.4 kb PCR fragment was isolated and cloned
with the pCRII-TOPO™ vector (Invitrogen) and completely sequenced using the T7 DNA
Sequenase™ kit (Amersham). See, SEQ.ID.NO.: 9.

e. hRUP6 (Seq. Id. Nos. 11 & 12) 

The full length hRUP6 was cloned by RT-PCR using primers:

5'-CAGGCCTTGGATTTTAATGTCAGGGATGG-3' (SEQ.ID.NO.: 57) and
5'-GGAGAGTCAGCTCTGAAAGAATTCAGG-3' (SEQ.ID.NO.: 58);

and human thymus Marathon-Ready™ cDNA (Clontech) as a template. Advantage cDNA
polymerase (Clontech, according to manufacturer's instructions) was used for the
amplification in a 50 ul reaction by the following cycle: 94°C for 30 sec; 94°C for 5 sec; 66°C
for 40 sec; 72°C for 2.5 sec and 72°C for 7 min. Cycles 2 through 4 were repeated 30 times.
A 1.3 Kb PCR fragment was isolated and cloned into the pCRII-TOPO™ vector (Invitrogen)
and completely sequenced (see, SEQ.ID.NO.: 11) using the ABI Big Dye Terminator™ kit
(P.E. Biosystem).

f. hRUP7 (Seq. Id. Nos. 13 & 14) 

The full length hRUP7 was cloned by RT-PCR using primers:

5'-TGATGTGATGCCAGATACTAATAGCAC-3' (SEQ.ID.NO.: 59; sense) and



WO 00/31258 PCT/US99/23687 
5'-CCTGATTCATTTAGGTGAGATTGAGAC-3' (SEQ.ID.NO.: 60; antisense)
and human peripheral leukocyte cDNA (Clontech) as a template. Advantage™ cDNA
polymerase (Clontech) was used for the amplification in a 50 ul reaction by the following
cycle with step 2 to step 4 repeated 30 times: 94°C for 2 minutes; 94°C for 15 seconds; 60°C
for 20 seconds; 72°C for 2 minutes; 72°C for 10 minutes. A 1.25 Kb PCR fragment was
isolated and cloned into the pCRII-TOPO™ vector (Invitrogen) and completely sequenced
using the ABI Big Dye Terminator™ kit (P.E. Biosystem). See, SEQ.ID.NO.: 13.
g. hARE-5 (Seq. Id. Nos. 5 & 6)
The full length hARE-5 was cloned by PCR using the hARE-5 specific primers
5'-CAGCGCAGGGTGAAGCCTGAGAGC-3' SEQ.ID.NO.: 69 (sense, 5' of initiation codon ATG)
and 5'-GGCACCTGCTGTGACCTGTGCAGG-3' SEQ.ID.NO.: 70 (antisense, 3' of stop codon TGA)
and human genomic DNA as template. TaqPlus Precision™ DNA polymerase (Stratagene)
was used for the amplification by the following cycle with step 2 to step 4 repeated 35 times:
96°C, 2 minutes; 96°C, 20 seconds; 58°C, 30 seconds; 72°C, 2 minutes; and 72°C, 10 minutes.
A 1.1 Kb PCR fragment of predicted size was isolated and cloned into the
pCRII-TOPO™ vector (Invitrogen) and completely sequenced (SEQ.ID.NO.: 5) using the T7
DNA Sequenase™ kit (Amersham).

h. hARE-4 (Seq. Id. Nos. 3 & 4)
The full length hARE-4 was cloned by PCR using the hARE-4 specific primers
5'-CTGGTGTGCTCCATGGCATCCC-3' SEQ.ID.NO.: 67 (sense, 5' of initiation codon ATG) and
5'-GTAAGCCTCCCAGAACGAGAGG-3' SEQ.ID.NO.: 68 (antisense, 3' of stop codon TGA) and
human genomic DNA as template. Taq DNA polymerase (Stratagene) and 5% DMSO were
used for the amplification by the following cycle with step 2 to step 3 repeated 35 times:
94°C, 3 minutes; 94°C, 30 seconds; 59°C, 2 minutes; 72°C, 10 minutes.

A 1.12 Kb PCR fragment of predicted size was isolated and cloned into the pCRII-TOPO™
vector (Invitrogen) and completely sequenced (SEQ.ID.NO.: 3) using the T7 DNA
Sequenase™ kit (Amersham).

i. hARE-3 (Seq. Id. Nos. 1 & 2)

The full length hARE-3 was cloned by PCR using the hARE-3 specific primers
5'-gatcaagcttCCATCCTACTGAAACCATGGTC-3' SEQ.ID.NO.: 65 (sense, lower case nucleotides
represent Hind III overhang, ATG as initiation codon) and
5'-gatcagatctCAGTTCCAATATTCACACCACCGTC-3' SEQ.ID.NO.: 66 (antisense, lower case
nucleotides represent Xba I overhang, TCA as stop codon) and human genomic DNA as
template. TaqPlus Precision™ DNA polymerase (Stratagene) was used for the amplification
by the following cycle with step 2 to step 4 repeated 35 times: 94°C, 3 minutes; 94°C, 1
minute; 55°C, 1 minute; 72°C, 2 minutes; 72°C, 10 minutes.

A 1.3 Kb PCR fragment of predicted size was isolated and digested with Hind III
and Xba I, cloned into the pRC/CMV2 vector (Invitrogen) at the Hind III and Xba I sites and
completely sequenced (SEQ.ID.NO.: 1) using the T7 DNA Sequenase™ kit (Amersham).
j. hRUP3 (Seq. Id. Nos. 7 & 8)
The full length hRUP3 was cloned by PCR using the hRUP3 specific primers
5'-GTCCTGCCACTTCGAGACATGG-3' SEQ.ID.NO.: 71 (sense, ATG as initiation codon) and
5'-GAAACTTCTCTGCCCTTACCGTC-3' SEQ.ID.NO.: 72 (antisense, 3' of stop codon TAA) and
human genomic DNA as template. TaqPlus Precision™ DNA polymerase (Stratagene) was
used for the amplification by the following cycle with step 2 to step 4 repeated 35 times:
94°C, 3 minutes; 94°C, 1 minute; 58°C, 1 minute; 72°C, 2 minutes; 72°C, 10 minutes.
A 1.0 Kb PCR fragment of predicted size was isolated and cloned into the pCRII-TOPO™
vector (Invitrogen) and completely sequenced (SEQ.ID.NO.: 7) using the T7 DNA
Sequenase™ kit (Amersham).

Example 2
Receptor Expression

Although a variety of cells are available to the art for the expression of proteins, it is
most preferred that mammalian cells be utilized. The primary reason for this is predicated
upon practicalities, i.e., utilization of, e.g., yeast cells for the expression of a GPCR, while
possible, introduces into the protocol a non-mammalian cell which may not (indeed, in the
case of yeast, does not) include the receptor-coupling, genetic-mechanism and secretory
pathways that have evolved for mammalian systems - thus, results obtained in non-mammalian
cells, while of potential use, are not as preferred as those obtained from
mammalian cells. Of the mammalian cells, COS-7, 293 and 293T cells are particularly
preferred, although the specific mammalian cell utilized can be predicated upon the particular
needs of the artisan. The general procedure for expression of the disclosed GPCRs is as
follows.

On day one, 1×10⁷ 293T cells per 150 mm plate were plated out. On day two, two
reaction tubes will be prepared (the proportions to follow for each tube are per plate): tube
A will be prepared by mixing 20 ug DNA (e.g., pCMV vector; pCMV vector with receptor
cDNA, etc.) in 1.2 ml serum free DMEM (Irvine Scientific, Irvine, CA); tube B will be
prepared by mixing 120 ul lipofectamine (Gibco BRL) in 1.2 ml serum free DMEM. Tubes
A and B are admixed by inversions (several times), followed by incubation at room
temperature for 30-45 min. The admixture can be referred to as the "transfection mixture".
Plated 293T cells are washed with 1X PBS, followed by addition of 10 ml serum free DMEM.
2.4 ml of the transfection mixture will then be added to the cells, followed by incubation for
4 hrs at 37°C/5% CO2. The transfection mixture will then be removed by aspiration, followed
by the addition of 25 ml of DMEM/10% Fetal Bovine Serum. Cells will then be incubated
at 37°C/5% CO2. After 72 hr incubation, cells can then be harvested and utilized for analysis.
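Because the proportions above are given per plate, preparing several plates at once is a matter of linear scaling. The sketch below shows one way to tabulate the amounts; the function `transfection_mix` and its dictionary keys are hypothetical, and the sketch assumes the per-plate quantities scale linearly without any allowance for pipetting loss.

```python
# Per-plate quantities as stated in the general procedure.
DNA_UG_PER_PLATE = 20.0
LIPOFECTAMINE_UL_PER_PLATE = 120.0
DMEM_ML_PER_TUBE_PER_PLATE = 1.2
MIXTURE_ML_ADDED_PER_PLATE = 2.4


def transfection_mix(n_plates: int) -> dict:
    """Scale tube A and tube B contents for n_plates 150 mm plates."""
    if n_plates < 1:
        raise ValueError("n_plates must be at least 1")
    return {
        "tube_a_dna_ug": round(DNA_UG_PER_PLATE * n_plates, 3),
        "tube_a_dmem_ml": round(DMEM_ML_PER_TUBE_PER_PLATE * n_plates, 3),
        "tube_b_lipofectamine_ul": round(LIPOFECTAMINE_UL_PER_PLATE * n_plates, 3),
        "tube_b_dmem_ml": round(DMEM_ML_PER_TUBE_PER_PLATE * n_plates, 3),
        "mixture_added_per_plate_ml": MIXTURE_ML_ADDED_PER_PLATE,
    }


for key, value in transfection_mix(3).items():
    print(key, value)
```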
Example 3

Tissue Distribution of the Disclosed Human GPCRs

Several approaches can be used for determination of the tissue distribution of the
GPCRs disclosed herein.

1. Dot-Blot Analysis

Using a commercially available human-tissue dot-blot format, endogenous orphan
GPCRs were probed for a determination of the areas where such receptors are localized.
cDNA fragments from the GPCRs of Example 1 (radiolabeled) were (or can be) used as the
probe: radiolabeled probe was (or can be) generated using the complete receptor cDNA
(excised from the vector) using a Prime-It II™ Random Primer Labeling Kit (Stratagene,
#300385), according to manufacturer's instructions. A human RNA Master Blot™
(Clontech, #7770-1) was hybridized with the endogenous human GPCR radiolabeled probe
and washed under stringent conditions according to manufacturer's instructions. The blot was
exposed to Kodak BioMax™ Autoradiography film overnight at -80°C. Results are
summarized for several receptors in Tables B and C (see Figures 1A and 1B for a grid
identifying the various tissues and their locations, respectively). Exemplary dot-blots are
provided in Figures 2A and 2B for results derived using hCHN3 and hCHN8, respectively.

TABLE B

Orphan GPCR    Tissue Distribution
               (highest levels, relative to other tissues in the dot-blot)

hGPCR27        Fetal brain, Putamen, Pituitary gland, Caudate nucleus
hARE-1         Spleen, Peripheral leukocytes, Fetal spleen
hPPR1          Pituitary gland, Heart, Salivary gland, Small intestine, Testis
hRUP3          Pancreas
hCHN3          Fetal brain, Putamen, Occipital cortex
hCHN9          Pancreas, Small intestine, Liver
hCHN10         Kidney, Thyroid

TABLE C

Orphan GPCR    Tissue Distribution
               (highest levels, relative to other tissues in the dot-blot)

hARE-3         Cerebellum left, Cerebellum right, Testis, Accumbens
hGPCR3         Corpus callosum, Caudate nucleus, Liver, Heart, Inter-Ventricular Septum
hARE-2         Cerebellum left, Cerebellum right, Substantia
hCHN8          Cerebellum left, Cerebellum right, Kidney, Lung



2. RT-PCR

a. hRUP3

To ascertain the tissue distribution of hRUP3 mRNA, RT-PCR was performed using
hRUP3-specific primers and human multiple tissue cDNA panels (MTC, Clontech) as
templates. Taq DNA polymerase (Stratagene) was utilized for the PCR reaction, using the
following reaction cycles in a 40 ul reaction: 94°C for 2 min; 94°C for 15 sec; 55°C for 30
sec; 72°C for 1 min; 72°C for 10 min. Primers were as follows:

5'-GACAGGTACCTTGCCATCAAG-3' (SEQ.ID.NO.: 61; sense)
5'-CTGCACAATGCCAGTGATAAGG-3' (SEQ.ID.NO.: 62; antisense).
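For orientation only, a rough melting temperature for primers of this kind can be estimated with the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C. This rule is a coarse approximation intended for short oligonucleotides and is not stated in this document as the basis for the 55°C annealing step; the function name `wallace_tm` is hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Estimate primer melting temperature (°C) with the Wallace rule.

    Tm = 2 * (number of A and T) + 4 * (number of G and C).
    """
    primer = primer.upper()
    if not primer or set(primer) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


print(wallace_tm("GACAGGTACCTTGCCATCAAG"))   # SEQ.ID.NO.: 61 (sense)
print(wallace_tm("CTGCACAATGCCAGTGATAAGG"))  # SEQ.ID.NO.: 62 (antisense)
```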

20 ul of the reaction was loaded onto a 1% agarose gel; results are set forth in Figure 3.

As is supported by the data of Figure 3, of the 16 human tissues in the cDNA panel
utilized (brain, colon, heart, kidney, lung, ovary, pancreas, placenta, prostate, skeleton, small
intestine, spleen, testis, thymus, leukocyte, and liver), a single hRUP3 band is evident only
from the pancreas. Additional comparative analysis of the protein sequence of hRUP3 with
other GPCRs suggests that hRUP3 is related to GPCRs having small molecule endogenous
ligands, such that it is predicted that the endogenous ligand for hRUP3 is a small molecule.

b. hRUP4
RT-PCR was performed using hRUP4 oligos 8 and 4 as primers and the human
multiple tissue cDNA panels (MTC, Clontech) as templates. Taq DNA polymerase
(Stratagene) was used for the amplification in a 40 ul reaction by the following cycles: 94°C
for 30 seconds, 94°C for 10 seconds, 55°C for 30 seconds, 72°C for 2 minutes, and 72°C for
5 minutes with cycles 2 through 4 repeated 30 times.

20 ul of the reaction were loaded on a 1% agarose gel to analyze the RT-PCR
products, and hRUP4 mRNA was found expressed in many human tissues, with the strongest
expression in heart and kidney (see, Figure 4). To confirm the authenticity of the PCR
fragments, a 300 bp fragment derived from the 5' end of hRUP4 was used as a probe for the
Southern Blot analysis. The probe was labeled with 32P-dCTP using the Prime-It II™
Random Primer Labeling Kit (Stratagene) and purified using the ProbeQuant™ G-50 micro
columns (Amersham). Hybridization was done overnight at 42°C following a 12 hr
pre-hybridization. The blot was finally washed at 65°C with 0.1 x SSC. The Southern blot did
confirm the PCR fragments as hRUP4.

c. hRUP5
RT-PCR was performed using the following hRUP5 specific primers:
5'-CTGACTTCTTGTTCCTGGCAGCAGCGG-3' (SEQ.ID.NO.: 63; sense)
5'-AGACCAGCCAGGGCACGCTGAAGAGTG-3' (SEQ.ID.NO.: 64; antisense)
and the human multiple tissue cDNA panels (MTC, Clontech) as templates. Taq DNA
polymerase (Stratagene) was used for the amplification in a 40 ul reaction by the following
cycles: 94°C for 30 sec, 94°C for 10 sec, 62°C for 1.5 min, 72°C for 5 min, with cycles
2 through 3 repeated 30 times. 20 ul of the reaction were loaded on a 1.5% agarose gel to
analyze the RT-PCR products, and hRUP5 mRNA was found expressed only in the
peripheral blood leukocytes (data not shown).
d. hRUP6

RT-PCR was applied to confirm the expression and to determine the tissue
distribution of hRUP6. Oligonucleotides used, based on an alignment of AC005871 and
GPR66 segments, had the following sequences:

5'-CCAACACCAGCATCCATGGCATCAAG-3' (SEQ.ID.NO.: 73; sense),
5'-GGAGAGTCAGCTCTGAAAGAATTGAGG-3' (SEQ.ID.NO.: 74; antisense)
and the human multiple tissue cDNA panels (MTC, Clontech) were used as templates.
PCR was performed using TaqPlus Precision™ polymerase (Stratagene; manufacturer's
instructions will be followed) in a 40 ul reaction by the following cycles: 94°C for 30 sec;
94°C for 5 sec; 66°C for 40 sec; 72°C for 2.5 min; and 72°C for 7 min. Cycles 2 through 4
were repeated 30 times.

20 ul of the reaction were loaded on a 1.2% agarose gel to analyze the RT-PCR
products, and a specific 760 bp DNA fragment representing hRUP6 was expressed
predominantly in the thymus and with less expression in the heart, kidney, lung, prostate,
small intestine and testis (see, Figure 5).

It is intended that each of the patents, applications, and printed publications
mentioned in this patent document be hereby incorporated by reference in their entirety.

As those skilled in the art will appreciate, numerous changes and modifications
may be made to the preferred embodiments of the invention without departing from the
spirit of the invention. It is intended that all such variations fall within the scope of the
invention and the claims that follow.

Although a variety of Vectors are available to those in the art, for purposes of
utilization for both endogenous and non-endogenous human GPCRs, it is most preferred
that the Vector utilized be pCMV. This vector was deposited with the American Type
Culture Collection (ATCC) on October 13, 1998 (10801 University Blvd., Manassas, VA
20110-2209 USA) under the provisions of the Budapest Treaty for the International
Recognition of the Deposit of Microorganisms for the Purpose of Patent Procedure. The
DNA was tested by the ATCC and determined to be. The ATCC has assigned the 
CLAIMS

What is claimed is:

1. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 1.

2. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 1 comprising SEQ.ID.NO.: 2.

3. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 1.

4. A Host Cell comprising the Plasmid of claim 3.

5. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 3.

6. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 3 comprising SEQ.ID.NO.: 4.

7. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 3.

8. A Host Cell comprising the Plasmid of claim 7.

9. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 5.

10. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 5 comprising SEQ.ID.NO.: 6.

11. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 5.

12. A Host Cell comprising the Plasmid of claim 11.

13. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 7.

14. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 7 comprising SEQ.ID.NO.: 8.

15. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 7.

16. A Host Cell comprising the Plasmid of claim 15.

17. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 9.

18. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 9 comprising SEQ.ID.NO.: 10.

19. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 9.

20. A Host Cell comprising the Plasmid of claim 19.

21. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 11.

22. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 11 comprising SEQ.ID.NO.: 12.

23. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 11.

24. A Host Cell comprising the Plasmid of claim 23.

25. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 13.

26. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 13 comprising SEQ.ID.NO.: 14.

27. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 13.

28. A Host Cell comprising the Plasmid of claim 27.

29. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 15.

30. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 15 comprising SEQ.ID.NO.: 16.

31. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 15.

32. A Host Cell comprising the Plasmid of claim 31.

33. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 17.

34. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 17 comprising SEQ.ID.NO.: 18.

35. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 17.

36. A Host Cell comprising the Plasmid of claim 35.

37. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 19.

38. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 19 comprising SEQ.ID.NO.: 20.

39. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 19.

40. A Host Cell comprising the Plasmid of claim 39.

41. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 21.

42. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 21 comprising SEQ.ID.NO.: 22.

43. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 21.

44. A Host Cell comprising the Plasmid of claim 43.

45. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 23.

46. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 23 comprising SEQ.ID.NO.: 24.

47. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 23.

48. A Host Cell comprising the Plasmid of claim 47.

49. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 25.

50. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 25 comprising SEQ.ID.NO.: 26.

51. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 25.

52. A Host Cell comprising the Plasmid of claim 51.

53. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 27.

54. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 27 comprising SEQ.ID.NO.: 28.

55. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 27.

56. A Host Cell comprising the Plasmid of claim 55.

57. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 29.

58. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 29 comprising SEQ.ID.NO.: 30.

59. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 29.

60. A Host Cell comprising the Plasmid of claim 59.

61. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 31.

62. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 31 comprising SEQ.ID.NO.: 32.

63. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 31.

64. A Host Cell comprising the Plasmid of claim 63.

65. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 33.

66. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 33 comprising SEQ.ID.NO.: 34.

67. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 33.

68. A Host Cell comprising the Plasmid of claim 67.

69. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 35.

70. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 35 comprising SEQ.ID.NO.: 36.

71. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 35.

72. A Host Cell comprising the Plasmid of claim 71.

73. A cDNA encoding a human G protein-coupled receptor comprising SEQ.ID.NO.: 37.

74. A human G protein-coupled receptor encoded by the cDNA of SEQ.ID.NO.: 37 comprising SEQ.ID.NO.: 38.

75. A Plasmid comprising a Vector and the cDNA of SEQ.ID.NO.: 37.

76. A Host Cell comprising the Plasmid of claim 75.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

    (i) APPLICANT: Chen, Ruoping
                   Dang, Huong T.
                   Liaw, Chen W.
                   Lin, I-Lin

    (ii) TITLE OF INVENTION: Human Orphan G Protein-Coupled Receptors
    (iii) NUMBER OF SEQUENCES: 74
    (iv) CORRESPONDENCE ADDRESS:
        (A) ADDRESSEE: Arena Pharmaceuticals, Inc.
        (B) STREET: 6166 Nancy Ridge Drive
        (C) CITY: San Diego
        (D) STATE: CA
        (E) COUNTRY: USA
        (F) ZIP: 92121

    (v) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Floppy disk
        (B) COMPUTER: IBM PC compatible
        (C) OPERATING SYSTEM: PC-DOS/MS-DOS
        (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

    (vi) CURRENT APPLICATION DATA:
        (A) APPLICATION NUMBER: US
        (B) FILING DATE:
        (C) CLASSIFICATION:

    (viii) ATTORNEY/AGENT INFORMATION:
        (A) NAME: Burgoon, Richard P.
        (B) REGISTRATION NUMBER: 34,787
    (ix) TELECOMMUNICATION INFORMATION:
        (A) TELEPHONE: (858)453-7200
        (B) TELEFAX: (858)453-7210

(2) INFORMATION FOR SEQ ID NO:1:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 1260 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
ATGGTCTTCT CGGCAGTGTT GACTGCGTTC CATACCGGGA CATCCAACAC AACATTTGTC 60
GTGTATGAAA ACACCTACAT GAATATTACA CTCCCTCCAC CATTCCAGCA TCCTGACCTC 120
AGTCCATTGC TTAGATATAG TTTTGAAACC ATGGCTCCCA CTGGTTTGAG TTCCTTGACC 180
GTGAATAGTA CAGCTGTGCC CACAACACCA GCAGCATTTA AGAGCCTAAA CTTGCCTCTT 240
CAGATCACCC TTTCTGCTAT AATGATATTC ATTCTGTTTG TGTCTTTTCT TGGGAACTTG 300
GTTGTTTGCC TCATGGTTTA CCAAAAAGCT GCCATGAGGT CTGCAATTAA CATCCTCCTT 360
GCCAGCCTAG CTTTTGCAGA CATGTTGCTT GCAGTGCTGA ACATGCCCTT TGCCCTGGTA 420
ACTATTCTTA CTACCCGATG GATTTTTGGG AAATTCTTCT GTAGGGTATC TGCTATGTTT 480
TTCTGGTTAT TTGTGATAGA AGGAGTAGCC ATCCTGCTCA TCATTAGCAT AGATAGGTTC 540
CTTATTATAG TCCAGAGGCA GGATAAGCTA AACCCATATA GAGCTAAGGT TCTGATTGCA 600
GTTTCTTGGG CAACTTCCTT TTGTGTAGCT TTTCCTTTAG CCGTAGGAAA CCCCGACCTG 660
CAGATACCTT CCCGAGCTCC CCAGTGTGTG TTTGGGTACA CAACCAATCC AGGCTACCAG 720
GCTTATGTGA TTTTGATTTC TCTCATTTCT TTCTTCATAC CCTTCCTGGT AATACTGTAC 780
TCATTTATGG GCATACTCAA CACCCTTCGG CACAATGCCT TGAGGATCCA TAGCTACCCT 840
GAAGGTATAT GCCTCAGCCA GGCCAGCAAA CTGGGTCTCA TGAGTCTGCA GAGACCTTTC 900
CAGATGAGCA TTGACATGGG CTTTAAAACA CGTGCCTTCA CCACTATTTT GATTCTCTTT 960

SxcISi02o TCTGCTG GGCCCCATTC acca "taca occttgtggc 

AAGCACTTTT ACTATCAGCA CAACTTTTTT GAGATTAGCA CCTGGCTACT GTGGCTCTGC 1080
TACCTCAAGT CTGCATTGAA TCCGCTGATC TACTACTGGA GGATTAAGAA ATTCCATGAT 1140
GCTTGCCTGG ACATGATGCC TAAGTCCTTC AAGTTTTTGC CGGAGCTCCC TGGTCACACA 1200
AAGCGACGGA TACGTCCTAG TGCTGTCTAT GTGTGTGGGG AACATCGGAC GGTGGTGTGA 1260
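The correspondence between a nucleic acid sequence and the amino acid sequence listed after it can be checked by translating the DNA with the standard genetic code. The sketch below is illustrative only; the function `translate` is hypothetical, and the two inputs are the first and last 60-base lines of SEQ ID NO:1 as printed above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1 to one-letter amino acids ('*' marks a stop codon)."""
    dna = "".join(dna.split()).upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


first_line = "ATGGTCTTCT CGGCAGTGTT GACTGCGTTC CATACCGGGA CATCCAACAC AACATTTGTC"
last_line = "AAGCGACGGA TACGTCCTAG TGCTGTCTAT GTGTGTGGGG AACATCGGAC GGTGGTGTGA"
print(translate(first_line))
print(translate(last_line))
```

The first line translates to residues beginning Met Val Phe Ser, and the last line ends in Thr Val Val followed by the TGA termination codon.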
(3) INFORMATION FOR SEQ ID NO:2:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 419 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Val Phe Ser Ala Val Leu Thr Ala Phe His Thr Gly Thr Ser Asn
1 5 10 15

Thr Thr Phe Val Val Tyr Glu Asn Thr Tyr Met Asn Ile Thr Leu Pro
20 25 30

Pro Pro Phe Gln His Pro Asp Leu Ser Pro Leu Leu Arg Tyr Ser Phe
35 40 45

Glu Thr Met Ala Pro Thr Gly Leu Ser Ser Leu Thr Val Asn Ser Thr
50 55 60

Ala Val Pro Thr Thr Pro Ala Ala Phe Lys Ser Leu Asn Leu Pro Leu
65 70 75 80

Gln Ile Thr Leu Ser Ala Ile Met Ile Phe Ile Leu Phe Val Ser Phe
85 90 95

Leu Gly Asn Leu Val Val Cys Leu Met Val Tyr Gln Lys Ala Ala Met
100 105 110

Arg Ser Ala Ile Asn Ile Leu Leu Ala Ser Leu Ala Phe Ala Asp Met
115 120 125

Leu Leu Ala Val Leu Asn Met Pro Phe Ala Leu Val Thr Ile Leu Thr
130 135 140

Thr Arg Trp Ile Phe Gly Lys Phe Phe Cys Arg Val Ser Ala Met Phe
145 150 155 160

Phe Trp Leu Phe Val Ile Glu Gly Val Ala Ile Leu Leu Ile Ile Ser
165 170 175

Ile Asp Arg Phe Leu Ile Ile Val Gln Arg Gln Asp Lys Leu Asn Pro
180 185 190

Tyr Arg Ala Lys Val Leu Ile Ala Val Ser Trp Ala Thr Ser Phe Cys
195 200 205

Val Ala Phe Pro Leu Ala Val Gly Asn Pro Asp Leu Gln Ile Pro Ser
210 215 220

Arg Ala Pro Gln Cys Val Phe Gly Tyr Thr Thr Asn Pro Gly Tyr Gln
225 230 235 240

Ala Tyr Val Ile Leu Ile Ser Leu Ile Ser Phe Phe Ile Pro Phe Leu
245 250 255

Val Ile Leu Tyr Ser Phe Met Gly Ile Leu Asn Thr Leu Arg His Asn
260 265 270

Ala Leu Arg Ile His Ser Tyr Pro Glu Gly Ile Cys Leu Ser Gln Ala
275 280 285

Ser Lys Leu Gly Leu Met Ser Leu Gln Arg Pro Phe Gln Met Ser Ile
290 295 300

Asp Met Gly Phe Lys Thr Arg Ala Phe Thr Thr Ile Leu Ile Leu Phe
305 310 315 320
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: Ala Varthe val cy. «, P ro p he T te T^Tyr Ser ieu val 

' "0 ,'■•335 ■ -• 

Ala Thr P,:« s.r ty , ^^^-^ ^^^.^^ 

345 350 ■ 

THr T,p ^ ^ ^ u ^ au ^ ^ 

365 

Leu lie r-yr Ty r *, ^ ,„ ; ly5 ^ phe ^ - u ^-^ 

Met Met Pro Lys Ser Ph* t^o bu« t 

385 ■ 7 39? LyS Phe Leu Pro Pro Gly His Thr 

-ys Arg Arg lie Arg Pro ser Ala val ^ val ^ ^ ^ , ^ ^ 



410 415 



Thr Val Val

(4) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1119 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATGTTAGCCA ACAGCTCCTC AACCAACAGT TC.GXTCTCG CGTGTCCTGA CTACCGACCT 60
ACCCACCGCC TCCACTTGGT GGTCTACAGC TTGGTGCTGG GIGCCGGGCT CGGCCrCAAC 120
GCGCTAGCCC TCTGGGTCTT CCTGCGCGCG CTGCGCGTGC ACTCGGTGGT GAGCGTGTAC 180
ATGTGTAACC TGGCGGCCAG CGACCTGCTC TTCACCCTCT CGCTGCCCGT TCGTCTCTCC 240
TACTACGCAC TGCACCACTG GCCCXTCCCC GACCTCCTGT GGCAGAGGAO GGGCGCCATC 300
TTCCAGATGA ACATGTACGG CAGCTGCATC TTCCTGATGC TCATCAACGT GGACCGCTAC 360
GCCGCCATCG TGCACCCGCT GCGACTGCGC CACCTGCGGC GGCCCCGCGT GGCGCGGCTC 420
CTCTGCCTGG GCGTGTGGGC CCCATCCTG GTGTTTGCCG TGCCCGCCGC

AGGCCCTCGC GTTGCCGCTA CCGGGACCTb GAGGTGCGCC TATGCTTCGA GAGCTTCAGC 540 
GACGAGCTGT GGAAAGGCAG GCTGCTGCCC CTCGTGCTGC TGGCCGAGGC GCTGGGCTTC 600
CTGCTGCCCC TGGCGGCGGT GGTCTACTCG TCGGGCCGAG TCTTCTGGAC GCTGGCGCGC 660
CCCGACGCCA CGCAGAGCCA GCGGCGGCGG AAGACCGTGC GCCTCCTGCT GGCTAACCTC 720
GTCATCTTCC TGCTGTGCTT CGTGCCCTAC AACAGCACGC TGGCGGTCTA CGGGCTGCTG 780
CGGAGCAAGC TGGTGGCGGC CAGCGTGCCT GCCCGCGATC GCGTGCGCGG GGTGCTGATG 840
GTGATGGTGC TGCTGGCCGG CGCCAACTGC GTGCTGGACC CGCTGGTGTA CTACTTTAGC 900
GCCGAGGGCT TCCGCAACAC CCTGCGCGGC CTGGGCACTC CGCACCGGGC CAGGACCTCG 960
GCCACCAACG GGACGCGGGC GGCGCTCGCG CAATCCGAAA GGTCCGCCGT CACCAGCGAC 1020
GCCACCAGGC CGGATGCCGC CAGTCAGGGG CTGCTCCGAC CCTCCGACTC CCACTCTCTG 1080
TCTTCCTTCA CACAGTGTCC CCAGGATTCC GCCCTCTGA 1119 
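
The stated lengths in the listing can be cross-checked arithmetically: SEQ ID NO:3 is given as 1119 base pairs and SEQ ID NO:4 as 372 amino acids. The Python sketch below is illustrative only. It assumes, which the listing itself does not state, that the nucleic acid sequence is a single open reading frame ending in one stop codon and that it encodes the amino acid sequence that follows it. The function name `expected_residues` is hypothetical.

```python
def expected_residues(base_pairs: int) -> int:
    """Residues encoded by an open reading frame of the given length.

    Assumes one terminal stop codon and no untranslated bases
    (an assumption made for this check, not a statement from the listing).
    """
    if base_pairs < 3 or base_pairs % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return base_pairs // 3 - 1  # the final codon is the stop codon


# Length stated in the listing for SEQ ID NO:3.
print(expected_residues(1119))  # prints 372, the length stated for SEQ ID NO:4
```

Under the same assumption, a mismatch between the two stated lengths would point to a transcription problem in one of the paired sequences rather than to anything about the receptor itself.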
(5) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 372 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Leu Ala Asn Ser Ser Ser Thr Asn Ser Ser Val Leu Pro Cys Pro
1 5 10 15

Asp Tyr Arg Pro Thr His Arg Leu His Leu Val Val Tyr Ser Leu Val
20 25 30

Leu Ala Ala Gly Leu Pro Leu Asn Ala Leu Ala Leu Trp Val Phe Leu
35 40 45

Arg Ala Leu Arg Val His Ser Val Val Ser Val Tyr Met Cys Asn Leu
50 55 60

Ala Ala Ser Asp Leu Leu Phe Thr Leu Ser Leu Pro Val Arg Leu Ser
65 70 75 80

Tyr Tyr Ala Leu His His Trp Pro Phe Pro Asp Leu Leu Cys Gln Thr
85 90 95

Thr Gly Ala Ile Phe Gln Met Asn Met Tyr Gly Ser Cys Ile Phe Leu
100 105 110

Met Leu Ile Asn Val Asp Arg Tyr Ala Ala Ile Val His Pro Leu Arg
115 120 125

Leu Arg His Leu Arg Arg Pro Arg Val Ala Arg Leu Leu Cys Leu Gly
130 135 140

Val Trp Ala Leu Ile Leu Val Phe Ala Val Pro Ala Ala Arg Val His
145 150 155 160

Arg Pro Ser Arg Cys Arg Tyr Arg Asp Leu Glu Val Arg Leu Cys Phe
165 170 175

Glu Ser Phe Ser Asp Glu Leu Trp Lys Gly Arg Leu Leu Pro Leu Val
180 185 190

Leu Leu Ala Glu Ala Leu Gly Phe Leu Leu Pro Leu Ala Ala Val Val
195 200 205

Tyr Ser Ser Gly Arg Val Phe Trp Thr Leu Ala Arg Pro Asp Ala Thr
210 215 220

Gln Ser Gln Arg Arg Arg Lys Thr Val Arg Leu Leu Leu Ala Asn Leu
225 230 235 240

Val Ile Phe Leu Leu Cys Phe Val Pro Tyr Asn Ser Thr Leu Ala Val
245 250 255

Tyr Gly Leu Leu Arg Ser Lys Leu Val Ala Ala Ser Val Pro Ala Arg
260 265 270

Asp Arg Val Arg Gly Val Leu Met Val Met Val Leu Leu Ala Gly Ala
275 280 285

Asn Cys Val Leu Asp Pro Leu Val Tyr Tyr Phe Ser Ala Glu Gly Phe
290 295 300

Arg Asn Thr Leu Arg Gly Leu Gly Thr Pro His Arg Ala Arg Thr Ser
305 310 315 320

Ala Thr Asn Gly Thr Arg Ala Ala Leu Ala Gln Ser Glu Arg Ser Ala
325 330 335

Val Thr Thr Asp Ala Thr Arg Pro Asp Ala Ala Ser Gln Gly Leu Leu
340 345 350

Arg Pro Ser Asp Ser His Ser Leu Ser Ser Phe Thr Gln Cys Pro Gln
355 360 365

Asp Ser Ala Leu
370

(6) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1107 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGGCCAACT CCACAGGGCT GAACGCCTCA GAAGTCGCAG GCTCGTTGGG GTTGATCCTG 60 

GCAGCTGTCG TGGAGGTGGG GGCACTGCTG GGCAACGGCG CGCTGCTGGT CGTGGTGCTG 120 

CGCACGCCGG GACTGCGCGA CGCGCTCTAC CTGGCGCACC TGTGCGTCGT GGACCTGCTG 180

GCGGCCGCCT CCATCATGCC GCTGGGCCTG CTGGCCGCAC CGCCGCCCGG GCTGGGCCGC 240

GTGCGCCTGG GCCCCGCGCC ATGCCGCGCC GCTCGCTTCC TCTCCGCCGC TCTGCTGCCG 300 

GCCTGCACGC TCGGGGTGGC CGCACTTGGC CTGGCACGCT ACCGCCTCAT CGTGCACCCG 360 

CTGCGGCCAG GCTCGCGGCC GCCGCCTGTG CTCGTGCTCA CCGCCGTGTG GGCCGCGGCG 420 

GGACTGCTGG GCGCGCTCTC CCTGCTCGGC CCGCCGCCCG CACCGCCCCC TGCTCCTGCT 480

CGCTGCTCGG TCCTGGCTGG GGGCCTCGGG CCCTTCCGGC CGCTCTGGGC CCTGCTGGCC 540 

TTCGCGCTGC CCGCCCTCCT GCTGCTCGGC GCCTACGGCG GCATCTTCGT GGTGGCGCGT 600 

CGCGCTGCCC TGAGGCCCCC ACGGCCGGCG CGCGGGTCCC GACTCCGCTC GGACTCTCTG 660

GATAGCCGCC TTTCCATCTT GCCGCCGCTC CGGCCTCGCC TGCCCGGGGG CAAGGCGGCC 720 

CTGGCCCCAG CGCTGGCCGT GGGCCAATTT GCAGCCTGCT GGCTGCCTTA TGGCTGCGCG 780

TGCCTGGCGC CCGCAGCGCG GGCCGCGGAA GCCGAAGCGG CTGTCACCTG GGTCGCCTAC 840

TCGGCCTTCG CGGCTCACCC CTTCCTGTAC GGGCTGCTGC AGCGCCCCGT GCGCTTGGCA 900

CTGGGCCGCC TCTCTCGCCG TGCACTGCCT GGACCTGTGC GGGCCTGCAC TCCGCAAGCC 960 

TGGCACCCGC GGGCACTCTT GCAATGCCTC CAGAGACCCC CAGAGGGCCC TGCCGTAGGC 1020

CCTTCTGAGG CTCCAGAACA GACCCCCGAG TTGGCAGGAG GGCGGAGCCC CGCATACCAG 1080

GGGCCACCTG AGAGTTCTCT CTCCTGA 1107
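
As a spot check, the first thirty bases of SEQ ID NO:5 can be read against the opening residues of SEQ ID NO:6. The Python sketch below is illustrative only: it assumes the standard genetic code and a reading frame that starts at base 1, and the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are hypothetical helpers, not part of this document.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> str:
    """Translate DNA (whitespace ignored) from base 1 until a stop codon."""
    dna = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[i:i + 3]]
        if amino == "*":  # stop codon ends the reading frame
            break
        residues.append(THREE_LETTER[amino])
    return " ".join(residues)


# First three ten-base groups of SEQ ID NO:5 as printed in the listing.
print(translate("ATGGCCAACT CCACAGGGCT GAACGCCTCA"))
# prints: Met Ala Asn Ser Thr Gly Leu Asn Ala Ser
```

The ten residues produced match the first ten residues printed for SEQ ID NO:6. The lookup raises `KeyError` on any character outside A, C, G, T, so a group damaged in transcription fails loudly instead of yielding a silently wrong residue.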

(7) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:



Met Ala Asn Ser Thr Gly Leu Asn Ala Ser Glu Val Ala Gly Ser Leu
1 5 10 15

Gly Leu Ile Leu Ala Ala Val Val Glu Val Gly Ala Leu Leu Gly Asn
20 25 30

Gly Ala Leu Leu Val Val Val Leu Arg Thr Pro Gly Leu Arg Asp Ala
35 40 45

Leu Tyr Leu Ala His Leu Cys Val Val Asp Leu Leu Ala Ala Ala Ser
50 55 60

Ile Met Pro Leu Gly Leu Leu Ala Ala Pro Pro Pro Gly Leu Gly Arg
65 70 75 80

Val Arg Leu Gly Pro Ala Pro Cys Arg Ala Ala Arg Phe Leu Ser Ala
85 90 95

Ala Leu Leu Pro Ala Cys Thr Leu Gly Val Ala Ala Leu Gly Leu Ala
100 105 110

Arg Tyr Arg Leu Ile Val His Pro Leu Arg Pro Gly Ser Arg Pro Pro
115 120 125

Pro Val Leu Val Leu Thr Ala Val Trp Ala Ala Ala Gly Leu Leu Gly
130 135 140

Ala Leu Ser Leu Leu Gly Pro Pro Pro Ala Pro Pro Pro Ala Pro Ala
145 150 155 160

Arg Cys Ser Val Leu Ala Gly Gly Leu Gly Pro Phe Arg Pro Leu Trp
165 170 175

Ala Leu Leu Ala Phe Ala Leu Pro Ala Leu Leu Leu Leu Gly Ala Tyr
180 185 190

Gly Gly Ile Phe Val Val Ala Arg Arg Ala Ala Leu Arg Pro Pro Arg
195 200 205

Pro Ala Arg Gly Ser Arg Leu Arg Ser Asp Ser Leu Asp Ser Arg Leu
210 215 220

Ser Ile Leu Pro Pro Leu Arg Pro Arg Leu Pro Gly Gly Lys Ala Ala
225 230 235 240

Leu Ala Pro Ala Leu Ala Val Gly Gln Phe Ala Ala Cys Trp Leu Pro
245 250 255

Tyr Gly Cys Ala Cys Leu Ala Pro Ala Ala Arg Ala Ala Glu Ala Glu
260 265 270

Ala Ala Val Thr Trp Val Ala Tyr Ser Ala Phe Ala Ala His Pro Phe
275 280 285

Leu Tyr Gly Leu Leu Gln Arg Pro Val Arg Leu Ala Leu Gly Arg Leu
290 295 300

Ser Arg Arg Ala Leu Pro Gly Pro Val Arg Ala Cys Thr Pro Gln Ala
305 310 315 320

Trp His Pro Arg Ala Leu Leu Gln Cys Leu Gln Arg Pro Pro Glu Gly
325 330 335

Pro Ala Val Gly Pro Ser Glu Ala Pro Glu Gln Thr Pro Glu Leu Ala
340 345 350

Gly Gly Arg Ser Pro Ala Tyr Gln Gly Pro Pro Glu Ser Ser Leu Ser
355 360 365

(8) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1008 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
ATGGAATCAT CTTTCTCATT TGGAGTGATC CTTGCTGTCC TGGCCTCCCT CATCATTGCT 60
ACTAACACAC TAGTGGCTGT GGCTGTGCTG CTGTTGATCC ACAAGAATGA TGGTGTCAGT 120
CTCTGCTTCA CCTTGAATCT GGCTGTGGCT GACACCTTGA TTGGTGTGGC CATCTCTGGC 180
CTACTCACAG ACCAGCTCTC CAGCCCTTCT CGGCCCACAC AGAAGACCCT GTGCAGCCTG 240
CGGATGGCAT TTGTCACTTC CTCCGCAGCT GCCTCTGTCC TCACGGTCAT GCTGATCACC 300
TTTGACAGGT ACCTTGCCAT CAAGCAGCCC TTCCGCTACT TGAAGATCAT GAGTGGGTTC 360
GTGGCCGGGG CCTGCATTGC CGGGCTGTGG TTAGTGTCTT ACCTCATTGG CTTCCTCCCA 420
CTCGGAATCC CCATGTTCCA GCAGACTGCC TACAAAGGGC AGTGCAGCTT CTTTGCTGTA 480
TTTCACCCTC ACTTCGTGCT GACCCTCTCC TGCGTTGGCT TCTTCCCAGC CATGCTCCTC 540 
TTTGTCTTCT TCTACTGCGA CATGCTCAAG ATTGCCTCCA TGCACAGCCA GCAGATTCGA 600 
AAGATGGAAC ATGCAGGAGC CATGGCTGGA GGTTATCGAT CCCCACGGAC TCCCAGCGAC 660 
TTCAAAGCTC TCCGTACTGT GTCTGTTCTC ATTGGGAGCT TTGCfCTATC CTGGACCCCC 720 
TTCCTTATCA CTGGCATTGT GCAGGTGGCC TGCCAGGAGT GTCACCTCTA CCTAGTGCTG 780
GAACGGTACC TGTGGCTGCT CGGCGTGGGC AACTCCCTGC TCAACCCACT CATCTATGCC 840
TATTGGCAGA AGGAGGTGCG ACTGCAGCTC TACCACATGG CCCTAGGAGT GAAGAAGGTG 900
CTCACCTGAT TCCTCCTCTT TCTCTCGGCC JUSGAATTGTG GCCCAGAGAG GCCCAGGGAA 960 
AGTTCCTGTC ACATCGTCAC TATCTCCAGC TGAGAGTTTG ATGGCTAA 1008 
(9) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 335 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Glu Ser Ser Phe Ser Phe Gly Val Ile Leu Ala Val Leu Ala Ser
1 5 10 15

Leu Ile Ile Ala Thr Asn Thr Leu Val Ala Val Ala Val Leu Leu Leu
20 25 30

Ile His Lys Asn Asp Gly Val Ser Leu Cys Phe Thr Leu Asn Leu Ala
35 40 45

Val Ala Asp Thr Leu Ile Gly Val Ala Ile Ser Gly Leu Leu Thr Asp
50 55 60

Gln Leu Ser Ser Pro Ser Arg Pro Thr Gln Lys Thr Leu Cys Ser Leu
65 70 75 80

Arg Met Ala Phe Val Thr Ser Ser Ala Ala Ala Ser Val Leu Thr Val
85 90 95

Met Leu Ile Thr Phe Asp Arg Tyr Leu Ala Ile Lys Gln Pro Phe Arg
100 105 110

Tyr Leu Lys Ile Met Ser Gly Phe Val Ala Gly Ala Cys Ile Ala Gly
115 120 125

Leu Trp Leu Val Ser Tyr Leu Ile Gly Phe Leu Pro Leu Gly Ile Pro
130 135 140

Met Phe Gln Gln Thr Ala Tyr Lys Gly Gln Cys Ser Phe Phe Ala Val
145 150 155 160

Phe His Pro His Phe Val Leu Thr Leu Ser Cys Val Gly Phe Phe Pro
165 170 175

Ala Met Leu Leu Phe Val Phe Phe Tyr Cys Asp Met Leu Lys Ile Ala
180 185 190




Ser Met His Ser Gln Gln Ile Arg Lys Met Glu His Ala Gly Ala Met
195 200 205

Ala Gly Gly Tyr Arg Ser Pro Arg Thr Pro Ser Asp Phe Lys Ala Leu
210 215 220

Arg Thr Val Ser Val Leu Ile Gly Ser Phe Ala Leu Ser Trp Thr Pro
225 230 235 240

Phe Leu Ile Thr Gly Ile Val Gln Val Ala Cys Gln Glu Cys His Leu
245 250 255

Tyr Leu Val Leu Glu Arg Tyr Leu Trp Leu Leu Gly Val Gly Asn Ser
260 265 270

Leu Leu Asn Pro Leu Ile Tyr Ala Tyr Trp Gln Lys Glu Val Arg Leu
275 280 285

Gln Leu Tyr His Met Ala Leu Gly Val Lys Lys Val Leu Thr Ser Phe
290 295 300

Leu Leu Phe Leu Ser Ala Arg Asn Cys Gly Pro Glu Arg Pro Arg Glu
305 310 315 320

Ser Ser Cys His Ile Val Thr Ile Ser Ser Ser Glu Phe Asp Gly
325 330 335

(10) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1413 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
ATGGACACTA CCATGGAAGC TGACCTGGGT GCCACTGGCC ACAGGCCCCG CACAGAGCTT 60 
GATGATGAGG ACTCCTACCC CCAAGGTGGC TGGGACACGG TCTTCCTGGT GGCCCTGCTG 120 
CTCCTTGGGC TGCCAGCCAA TGGGTTGATG GCGTGGCTGG CCGGCTCCCA GGCCCGGCAT 180 
GGAGCTGGCA CGCGTCTGGC GCTGCTCCTG CTCAGCCTGG CCCTCTCTGA CTTCTTGTTC 240
CTGGCAGCAG CGGCCTTCCA GATCCTAGAG ATCCGGCATG GGGGACACTG GCCGCTGGGG 300 
ACAGCTGCCT GCCGCTTCTA CTACTTCCTA TGGGGCGTGT CCTACTCCTC CGGCCTCTTC 360 
CTGCTGGCCG CCCTCAGCCT CGACCGCTGC CTGCTGGCGC TGTGCCCACA CTGGTACCCT 420 
GGGCACCGCC CAGTCCGCCT GCCCCTCTGG GTCTGCGCCG GTGTCTGGGT GCTGGCCACA 480
CTCTTCAGCG TGCCCTGGCT GGTCTTCCCC GAGGCTGCCG TCTGGTGGTA CGACCTGGTC 540
ATCTGCCTGG ACTTCTGGGA CAGCGAGGAG CTGTCGCTGA GGATGCTGGA GGTCCTGGGG 600
GGCTTCCTGC CTTTCCTCCT GCTGCTCGTC TGCCACGTGC TCACCCAGGC CACAGCCTGT 660
CGCACCTGCC ACCGCCAACA GCAGCCCGCA GCCTGCCGGG GCTTCGCCCG TGTGGCCAGG 720
ACCATTCTGT CAGCCTATGT GGTCCTGAGG CTGCCCTACC AGCTGGCCCA GCTGCTCTAC 780
CTGGCCTTCC TGTGGGACGT CTACTCTGGC TACCTGCTCT GGGAGGCCCT GGTCTACTCC 840
GACTACCTGA TCCTACTCAA CAGCTGCCTC AGCCCCTTCC TCTGCCTCAT GGCCAGTGCC 900
GACCTCCGGA CCCTGCTGCG CTCCGTGCTC TCGTCCTTCG CGGCAGCTCT CTGCGAGGAG 960
CGGCCGGGCA GCTTCACGCC CACTGAGCCA CAGACCCAGC TAGATTCTGA GGGTCCAAGT 1020
CTGCCAGAGC CGATGGCAGA GGCCCAGTCA CAGATGGATC CTGTGGCCCA GCCTCAGGTG 1080
AACCCCACAC TCCAGCCACG ATCGGATCCC ACAGCTCAGG CACAGCTGAA CCCTACGGCC 1140
CAGCCACAGT CGGATCCCAC AGCCCAGCCA CAGCTGAACC TCATGGCCCA GCCACAGTCA 1200
GATTCTGTGG CCCAGCCACA GGCAGACACT AACGTCCAGA CCCCTGCACC TGCTGCCAGT 1260
TCTGTGCGCA GTCCCTGTGA TGAAGCTTCC CCAACCCCAT CCTCGCATCC TACCCCAGGG 1320
GCCCTTGAGG ACCCAGCCAC ACCTCCTGCC TCTGAAGGAG AAAGCCCCAG CAGCACCCCG 1380

CCAGAGGCGG CCCCGGGCGC AGGCCCCACG TGA 1413 

(11) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 468 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 



Met Asp Thr Thr Met Glu Ala Asp Leu Gly Ala Thr Gly His Arg Pro
1 5 10 15

Arg Thr Glu Leu Asp Asp Glu Asp Ser Tyr Pro Gln Gly Gly Trp Asp
20 25 30

Thr Val Phe Leu Val Ala Leu Leu Leu Leu Gly Leu Pro Ala Asn Gly
35 40 45

Leu Met Ala Trp Leu Ala Gly Ser Gln Ala Arg His Gly Ala Gly Thr
50 55 60

Arg Leu Ala Leu Leu Leu Leu Ser Leu Ala Leu Ser Asp Phe Leu Phe
65 70 75 80

Leu Ala Ala Ala Ala Phe Gln Ile Leu Glu Ile Arg His Gly Gly His
85 90 95

Trp Pro Leu Gly Thr Ala Ala Cys Arg Phe Tyr Tyr Phe Leu Trp Gly
100 105 110

Val Ser Tyr Ser Ser Gly Leu Phe Leu Leu Ala Ala Leu Ser Leu Asp
115 120 125

Arg Cys Leu Leu Ala Leu Cys Pro His Trp Tyr Pro Gly His Arg Pro
130 135 140

Val Arg Leu Pro Leu Trp Val Cys Ala Gly Val Trp Val Leu Ala Thr
145 150 155 160

Leu Phe Ser Val Pro Trp Leu Val Phe Pro Glu Ala Ala Val Trp Trp
165 170 175

Tyr Asp Leu Val Ile Cys Leu Asp Phe Trp Asp Ser Glu Glu Leu Ser
180 185 190

Leu Arg Met Leu Glu Val Leu Gly Gly Phe Leu Pro Phe Leu Leu Leu
195 200 205

Leu Val Cys His Val Leu Thr Gln Ala Thr Arg Thr Cys His Arg Gln
210 215 220

Gln Gln Pro Ala Ala Cys Arg Gly Phe Ala Arg Val Ala Arg Thr Ile
225 230 235 240

Leu Ser Ala Tyr Val Val Leu Arg Leu Pro Tyr Gln Leu Ala Gln Leu
245 250 255

Leu Tyr Leu Ala Phe Leu Trp Asp Val Tyr Ser Gly Tyr Leu Leu Trp
260 265 270

Glu Ala Leu Val Tyr Ser Asp Tyr Leu Ile Leu Leu Asn Ser Cys Leu
275 280 285

Ser Pro Phe Leu Cys Leu Met Ala Ser Ala Asp Leu Arg Thr Leu Leu
290 295 300

Arg Ser Val Leu Ser Ser Phe Ala Ala Ala Leu Cys Glu Glu Arg Pro
305 310 315 320

Gly Ser Phe Thr Pro Thr Glu Pro Gln Thr Gln Leu Asp Ser Glu Gly
325 330 335

Pro Thr Leu Pro Glu Pro Met Ala Glu Ala Gln Ser Gln Met Asp Pro
340 345 350

Val Ala Gln Pro Gln Val Asn Pro Thr Leu Gln Pro Arg Ser Asp Pro
355 360 365

Thr Ala Gln Pro Gln Leu Asn Pro Thr Ala Gln Pro Gln Ser Asp Pro
370 375 380

Thr Ala Gln Pro Gln Leu Asn Leu Met Ala Gln Pro Gln Ser Asp Ser
385 390 395 400

Val Ala Gln Pro Gln Ala Asp Thr Asn Val Gln Thr Pro Ala Pro Ala
405 410 415

Ala Ser Ser Val Pro Ser Pro Cys Asp Glu Ala Ser Pro Thr Pro Ser
420 425 430

Ser His Pro Thr Pro Gly Ala Leu Glu Asp Pro Ala Thr Pro Pro Ala
435 440 445

Ser Glu Gly Glu Ser Pro Ser Ser Thr Pro Pro Glu Ala Ala Pro Gly
450 455 460

Ala Gly Pro Thr
465

(12) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1248 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ATGTCAGGGA TGGAAAAACT TCAGAATGCT TCCTGGATCT ACCAGCAGAA ACTAGAAGAT 60
CCATTCCAGA AACACCTGAA CAGCACCGAG GAGTATCTGG CCTTCCTCTG CGGACCTCGG 120
CGCAGCCACT TCTTCCTCCC CGTGTCTGTG GTGTATGTGC CAATTTTTGT GGTGGGGGTC 180
ATTGGCAATG TCCTGGTGTG CCTGGTGATT CTGCAGCACC AGGCTATGAA GACGCCCACC 240
AACTACTACC TCTTCAGCCT GGCGGTCTCT GACCTCCTGG TCCTGCTCCT TGGAATGCCC 300
CTGGAGGTCT ATGAGATGTG GCGCAACTAC CCTTTCTTGT TCGGGCCCGT GGGCTGCTAC 360
TTCAAGACGG CCCTCTTTGA GACCGTGTGC TTCGCCTCCA TCCTCAGCAT CACCACCGTC 420
AGCGTGGAGC GCTACGTGGC CATCCTACAC CCGTTCCGCG CCAAACTGCA GAGCACCCGG 480
CGCCGGGCCC TCAGGATCCT CGGCATCGTC TGGGGCTTCT CCGTGCTCTT CTCCCTGCCC 540
AACACCAGCA TCCATGGCAT CAAGTTCCAC TACTTCCCCA ATGGGTCCCT GGTCCCAGGT 600

TCGGCCACCT GTACGGTCAT CAAGCCCATG TGGATCTACA ATTTCATCAT CCAGGTCACC 660

TCCTTCCTAT TCTACCTCCT CCCCATGACT GTCATCAGTG TCCTCTACTA CCTCATGGCA 720 

CTCAGACTAA AGAAAGACAA ATCTCTTGAG GCAGATGAAG GGAATGCAAA TATTCAAAGA 780 

CCCTGCAGAA AATCAGTCAA CAAGATGCTG TTTGTCTTGG TCTTAGTGTT TGCTATCTGT 840

TGGGCCCCGT TCCACATTGA CCGACTCTTC TTCAGCTTTG TGGAGGAGTG GAGTGAATCC 900 

CTGGCTGCTG TGTTCAACCT CGTCCATGTG GTGTCAGGTG TCTTCTTCTA CCTGAGCTCA 960 

GCTGTCAACC CCATTATCTA TAACCTACTG TCTCGCCGCT TCCAGGCAGC ATTCCAGAAT 1020

GTGATCTCTT CTTTCCACAA ACAGTGGCAC TCCCAGCATG ACCCACAGTT GCCACCTGCC 1080

CAGCGGAACA TCTTCCTGAC AGAATGCCAC TTTGTGGAGC TGACCGAAGA TATAGGTCCC 1140

CAATTCCCAT GTCAGTCATC CATGCACAAC TCTCACCTCC CAACAGCCCT CTCTAGTGAA 1200

CAGATGTCAA GAACAAACTA TCAAAGCTTC CACTTTAACA AAACCTGA 1248 

(13) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 415 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Ser Gly Met Glu Lys Leu Gln Asn Ala Ser Trp Ile Tyr Gln Gln
1 5 10 15

Lys Leu Glu Asp Pro Phe Gln Lys His Leu Asn Ser Thr Glu Glu Tyr
20 25 30

Leu Ala Phe Leu Cys Gly Pro Arg Arg Ser His Phe Phe Leu Pro Val
35 40 45

Ser Val Val Tyr Val Pro Ile Phe Val Val Gly Val Ile Gly Asn Val
50 55 60

Leu Val Cys Leu Val Ile Leu Gln His Gln Ala Met Lys Thr Pro Thr
65 70 75 80

Asn Tyr Tyr Leu Phe Ser Leu Ala Val Ser Asp Leu Leu Val Leu Leu
85 90 95

Leu Gly Met Pro Leu Glu Val Tyr Glu Met Trp Arg Asn Tyr Pro Phe
100 105 110

Leu Phe Gly Pro Val Gly Cys Tyr Phe Lys Thr Ala Leu Phe Glu Thr
115 120 125

Val Cys Phe Ala Ser Ile Leu Ser Ile Thr Thr Val Ser Val Glu Arg
130 135 140

Tyr Val Ala Ile Leu His Pro Phe Arg Ala Lys Leu Gln Ser Thr Arg
145 150 155 160

Arg Arg Ala Leu Arg Ile Leu Gly Ile Val Trp Gly Phe Ser Val Leu
165 170 175

Phe Ser Leu Pro Asn Thr Ser Ile His Gly Ile Lys Phe His Tyr Phe
180 185 190

Pro Asn Gly Ser Leu Val Pro Gly Ser Ala Thr Cys Thr Val Ile Lys
195 200 205

Pro Met Trp Ile Tyr Asn Phe Ile Ile Gln Val Thr Ser Phe Leu Phe
210 215 220

Tyr Leu Leu Pro Met Thr Val Ile Ser Val Leu Tyr Tyr Leu Met Ala
225 230 235 240

Leu Arg Leu Lys Lys Asp Lys Ser Leu Glu Ala Asp Glu Gly Asn Ala
245 250 255

Asn Ile Gln Arg Pro Cys Arg Lys Ser Val Asn Lys Met Leu Phe Val
260 265 270

Leu Val Leu Val Phe Ala Ile Cys Trp Ala Pro Phe His Ile Asp Arg
275 280 285

Leu Phe Phe Ser Phe Val Glu Glu Trp Ser Glu Ser Leu Ala Ala Val
290 295 300

Phe Asn Leu Val His Val Val Ser Gly Val Phe Phe Tyr Leu Ser Ser
305 310 315 320

Ala Val Asn Pro Ile Ile Tyr Asn Leu Leu Ser Arg Arg Phe Gln Ala
325 330 335

Ala Phe Gln Asn Val Ile Ser Ser Phe His Lys Gln Trp His Ser Gln
340 345 350

His Asp Pro Gln Leu Pro Pro Ala Gln Arg Asn Ile Phe Leu Thr Glu
355 360 365

Cys His Phe Val Glu Leu Thr Glu Asp Ile Gly Pro Gln Phe Pro Cys
370 375 380

Gln Ser Ser Met His Asn Ser His Leu Pro Thr Ala Leu Ser Ser Glu
385 390 395 400

Gln Met Ser Arg Thr Asn Tyr Gln Ser Phe His Phe Asn Lys Thr
405 410 415

(14) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1173 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

ATGCCAGATA CTAATAGCAC AATCAATTTA TCACTAAGCA CTCGTGTTAC TTTAGCATTT 60 
TTTATGTCCT TAGTAGCTTT TGCTATAATG CTAGGAAATG CTTTGGTCAT TTTAGCTTTT 120 
GTGGTGGACA AAAACCTTAG ACATCGAAGT AGTTATTTTT TTCTTAACTT GGCCATCTCT 180 
GACTTCTTTG TGGGTGTGAT CTCCATTCCT TTGTACATCC CTCACACGCT GTTCGAATGG 240
GATTTTGGAA AGGAAATCTG TGTATTTTGG CTCACTACTG ACTATCTGTT ATGTACAGCA 300
TCTGTATATA ACATTGTCCT CATCAGCTAT GATCGATACC TGTCAGTCTC AAATGCTGTG 360
TCTTATAGAA CTCAACATAC TGGGGTCTTG AAGATTGTTA CTCTGATGGT GGCCGTTTGG 420
GTGCTGGCCT TCTTAGTGAA TGGGCCAATG ATTCTAGTTT CAGAGTCTTG GAAGGATGAA 480
GGTAGTGAAT GTGAACCTGG ATTTTTTTCG GAATGGTACA TCCTTGCCAT CACATCATTC 540
TTGGAATTCG TGATCCCAGT CATCTTAGTC GCTTATTTCA ACATGAATAT TTATTGGAGC 600
CTGTGGAAGC GTGATCATCT CAGTAGGTGC CAAAGCCATC CTGGACTGAC TGCTGTCTCT 660 
TCCAACATCT GTGGACACTC ATTCAGAGGT AGACTATCTT CAAGGAGATC TCTTTCTGCA 720 
TCGACAGAAG TTCCTGCATC CTTTCATTCA GAGAGACAGA GGAGAAAGAG TAGTCTCATG 780
TTTTCCTCAA GAACCAAGAT GAATAGCAAT ACAATTGCTT CCAAAATGGG TTCCTTCTCC 840
CAATCAGATT CTGTAGCTCT TCACCAAAGG GAACATGTTG AACTGCTTAG AGCCAGGAGA 900
TTAGCCAAGT CACTGGCCAT TCTCTTAGGG GTTTTTGCTG TTTGCTGGGC TCCATATTCT 960
CTGTTCACAA TTGTCCTTTC ATTTTATTCC TCAGCAACAG GTCCTAAATC AGTTTGGTAT 1020
AGAATTGCAT TTTGGCTTCA GTGGTTCAAT TCCTTTGTCA ATCCTCTTTT GTATCCATTG 1080
TGTCACAAGC GCTTTCAAAA GGCTTTCTTG AAAATATTTT GTATAAAAAA GCAACCTCTA 1140
CCATCACAAC ACAGTCGGTC AGTATCTTCT TAA 1173



(15) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 390 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Pro Asp Thr Asn Ser Thr Ile Asn Leu Ser Leu Ser Thr Arg Val
1 5 10 15

Thr Leu Ala Phe Phe Met Ser Leu Val Ala Phe Ala Ile Met Leu Gly
20 25 30

Asn Ala Leu Val Ile Leu Ala Phe Val Val Asp Lys Asn Leu Arg His
35 40 45

Arg Ser Ser Tyr Phe Phe Leu Asn Leu Ala Ile Ser Asp Phe Phe Val
50 55 60

Gly Val Ile Ser Ile Pro Leu Tyr Ile Pro His Thr Leu Phe Glu Trp
65 70 75 80

Asp Phe Gly Lys Glu Ile Cys Val Phe Trp Leu Thr Thr Asp Tyr Leu
85 90 95

Leu Cys Thr Ala Ser Val Tyr Asn Ile Val Leu Ile Ser Tyr Asp Arg
100 105 110

Tyr Leu Ser Val Ser Asn Ala Val Ser Tyr Arg Thr Gln His Thr Gly
115 120 125

Val Leu Lys Ile Val Thr Leu Met Val Ala Val Trp Val Leu Ala Phe
130 135 140

Leu Val Asn Gly Pro Met Ile Leu Val Ser Glu Ser Trp Lys Asp Glu
145 150 155 160

Gly Ser Glu Cys Glu Pro Gly Phe Phe Ser Glu Trp Tyr Ile Leu Ala
165 170 175

Ile Thr Ser Phe Leu Glu Phe Val Ile Pro Val Ile Leu Val Ala Tyr
180 185 190

Phe Asn Met Asn Ile Tyr Trp Ser Leu Trp Lys Arg Asp His Leu Ser
195 200 205

Arg Cys Gln Ser His Pro Gly Leu Thr Ala Val Ser Ser Asn Ile Cys
210 215 220

Gly His Ser Phe Arg Gly Arg Leu Ser Ser Arg Arg Ser Leu Ser Ala
225 230 235 240

Ser Thr Glu Val Pro Ala Ser Phe His Ser Glu Arg Gln Arg Arg Lys
245 250 255

Ser Ser Leu Met Phe Ser Ser Arg Thr Lys Met Asn Ser Asn Thr Ile
260 265 270

Ala Ser Lys Met Gly Ser Phe Ser Gln Ser Asp Ser Val Ala Leu His
275 280 285

Gln Arg Glu His Val Glu Leu Leu Arg Ala Arg Arg Leu Ala Lys Ser
290 295 300

Leu Ala Ile Leu Leu Gly Val Phe Ala Val Cys Trp Ala Pro Tyr Ser
305 310 315 320

Leu Phe Thr Ile Val Leu Ser Phe Tyr Ser Ser Ala Thr Gly Pro Lys
325 330 335

Ser Val Trp Tyr Arg Ile Ala Phe Trp Leu Gln Trp Phe Asn Ser Phe
340 345 350

Val Asn Pro Leu Leu Tyr Pro Leu Cys His Lys Arg Phe Gln Lys Ala
355 360 365

Phe Leu Lys Ile Phe Cys Ile Lys Lys Gln Pro Leu Pro Ser Gln His
370 375 380

Ser Arg Ser Val Ser Ser
385 390

(16) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1128 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

ATGGCGAACG CGAGCGAGCC GGGTGGCAGC GGCGGCGGCG AGGCGGCCGC CCTGGGCCTC 60 
AAGCTGGCCA CGCTCAGCCT GCTGCTGTGC GTGAGCCTAG CGGGCAACGT GCTGTTCGCG 120 
CTGCTGATCG TGCGGGAGCG CAGCCTGCAC CGCGCCCCGT ACTACCTGCT GCTCGACCTG 180 
TGCCTGGCCG ACGGGCTGCG CGCGCTCGCC TGCCTCCCGG CCGTCATGCT GGCGGCGCGG 240 

CGTGCGGCGG CCGCGGCGGG GGCGCCGCCG GGCGCGCTGG GCTGCAAGCT GCTCGCCTTC 300
CTGGCCGCGC TCTTCTGCTT CCACGCCGCC TTCCTGCTGC TGGGCGTGGG CGTCACCCGC 360
TACCTGGCCA TCGCGCACCA CCGCTTCTAT GCAGAGCGCC TGGCCGGCTG GCCGTGCGCC 420
GCCATGCTGG TGTGCGCCGC CTGGGCGCTG GCGCTGGCCG CGGCCTTCCC GCCAGTGCTG 480
GACGGCGGTG GCGACGACGA GGACGCGCCG TGCGCCCTGG AGCAGCGGCC CGACGGCGCC 540
CCCGGCGCGC TGGGCTTCCT GCTGCTGCTG GCCGTGGTGG TGGGCGCCAC GCACCTCGTC 600
TACCTCCGCC TGCTCTTCTT CATCCACGAC CGCCGCAAGA TGCGGCCCGC GCGCCTGGTG 660
CCCGCCGTCA GCCACGACTG GACCTTCCAC GGCCCGGGCG CCACCGGCCA GGCGGCCGCC 720
AACTGGACGG CGGGCTTCGG CCGCGGGCCC ACGCCGCCCG CGCTTGTGGG CATCCGGCCC 780
GCAGGGCCGG GCCGCGGCGC GCGCCGCCTC CTCGTGCTGG AAGAATTCAA GACGGAGAAG 840
AGGCTGTGCA AGATGTTCTA CGCCGTCACG CTGCTCTTCC TGCTCCTCTG GGGGCCCTAC 900
GTCGTGGCCA GCTACCTGCG GGTCCTGGTG CGGCCCGGCG CCGTCCCCCA GGCCTACCTG 960
ACGGCCTCCG TGTGGCTGAC CTTCGCGCAG GCCGGCATCA ACCCCGTCGT GTGCTTCCTC 1020
TTCAACAGGG AGCTGAGGGA CTGCTTCAGG GCCCAGTTCC CCTGCTGCCA GAGCCCCCGG 1080
ACCACCCAGG CGACCCATCC CTGCGACCTG AAAGGCATTG GTTTATGA 1128
(17) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 375 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Ala Asn Ala Ser Glu Pro Gly Gly Ser Gly Gly Gly Glu Ala Ala
1 5 10 15

Ala Leu Gly Leu Lys Leu Ala Thr Leu Ser Leu Leu Leu Cys Val Ser
20 25 30

Leu Ala Gly Asn Val Leu Phe Ala Leu Leu Ile Val Arg Glu Arg Ser
35 40 45

Leu His Arg Ala Pro Tyr Tyr Leu Leu Leu Asp Leu Cys Leu Ala Asp
50 55 60

Gly Leu Arg Ala Leu Ala Cys Leu Pro Ala Val Met Leu Ala Ala Arg
65 70 75 80

Arg Ala Ala Ala Ala Ala Gly Ala Pro Pro Gly Ala Leu Gly Cys Lys
85 90 95

Leu Leu Ala Phe Leu Ala Ala Leu Phe Cys Phe His Ala Ala Phe Leu
100 105 110

Leu Leu Gly Val Gly Val Thr Arg Tyr Leu Ala Ile Ala His His Arg
115 120 125

Phe Tyr Ala Glu Arg Leu Ala Gly Trp Pro Cys Ala Ala Met Leu Val
130 135 140

Cys Ala Ala Trp Ala Leu Ala Leu Ala Ala Ala Phe Pro Pro Val Leu
145 150 155 160

Asp Gly Gly Gly Asp Asp Glu Asp Ala Pro Cys Ala Leu Glu Gln Arg
165 170 175

Pro Asp Gly Ala Pro Gly Ala Leu Gly Phe Leu Leu Leu Leu Ala Val
180 185 190

Val Val Gly Ala Thr His Leu Val Tyr Leu Arg Leu Leu Phe Phe Ile
195 200 205

His Asp Arg Arg Lys Met Arg Pro Ala Arg Leu Val Pro Ala Val Ser
210 215 220

His Asp Trp Thr Phe His Gly Pro Gly Ala Thr Gly Gln Ala Ala Ala
225 230 235 240

Asn Trp Thr Ala Gly Phe Gly Arg Gly Pro Thr Pro Pro Ala Leu Val
245 250 255

Gly Ile Arg Pro Ala Gly Pro Gly Arg Gly Ala Arg Arg Leu Leu Val
260 265 270

Leu Glu Glu Phe Lys Thr Glu Lys Arg Leu Cys Lys Met Phe Tyr Ala
275 280 285

Val Thr Leu Leu Phe Leu Leu Leu Trp Gly Pro Tyr Val Val Ala Ser
290 295 300

Tyr Leu Arg Val Leu Val Arg Pro Gly Ala Val Pro Gln Ala Tyr Leu
305 310 315 320

Thr Ala Ser Val Trp Leu Thr Phe Ala Gln Ala Gly Ile Asn Pro Val
325 330 335

Val Cys Phe Leu Phe Asn Arg Glu Leu Arg Asp Cys Phe Arg Ala Gln
340 345 350

Phe Pro Cys Cys Gln Ser Pro Arg Thr Thr Gln Ala Thr His Pro Cys
355 360 365

Asp Leu Lys Gly Ile Gly Leu
370 375

(18) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1002 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
ATGAACACCA CAGTGATGCA AGGCTTCAAC AGATCTGAGC GGTGCCCCAG AGACACTCGG 60
ATAGTACAGC TGGTATTCCC AGCCCTCTAC ACAGTGGTTT TCTTGACCGG CATCCTGCTG 120
AATACTTTGG CTCTGTGGGT GTTTGTTCAC ATCCCCAGCT CCTCCACCTT CATCATCTAC 180
CTCAAAAACA CTTTGGTGGC CGACTTGATA ATGACACTCA TGCTTCCTTT CAAAATCCTC 240
TCTGACTCAC ACCTGGCACC CTGGCAGCTC AGAGCTTTTG TGTGTCGTTT TTCTTCGGTG 300
ATATTTTATG AGACCATGTA TGTGGGCATC GTGCTGTTAG GGCTCATAGC CTTTGACAGA 360
TTCCTCAAGA TCATCAGACC TTTGAGAAAT ATTTTTCTAA AAAAACCTGT TTTTGCAAAA 420
ACGGTCTCAA TCTTCATCTG GTTCTTTTTG TTCTTCATCT CCCTGCCAAA TACGATCTTG 480
AGCAACAAGG AAGCAACACC ATCGTCTGTG AAAAAGTGTG CTTCCTTAAA GGGGCCTCTG 540
GGGCTGAAAT GGCATCAAAT GGTAAATAAC ATATGCCAGT TTATTTTCTG GACTGTTTTT 600
ATCCTAATGC TTGTGTTTTA TGTGGTTATT GCAAAAAAAG TATATGATTC TTATAGAAAG 660
TCCAAAAGTA AGGACAGAAA AAACAACAAA AAGCTGGAAG GCAAAGTATT TGTTGTCGTG 720
GCTGTCTTCT TTGTGTGTTT TGCTCCATTT CATTTTGCCA GAGTTCCATA TACTCACAGT 780
CAAACCAACA ATAAGACTGA CTGTAGACTG CAAAATCAAC TGTTTATTGC TAAAGAAACA 840
ACTCTCTTTT TGGCAGCAAC TAACATTTGT ATGGATCCCT TAATATACAT ATTCTTATGT 900
AAAAAATTCA CAGAAAAGCT ACCATGTATG CAAGGGAGAA AGACCACAGC ATCAAGCCAA 960
GAAAATCATA GCAGTCAGAC AGACAACATA ACCTTAGGCT GA 1002

(19) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 333 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 



Met Asn Thr Thr Val Met Gln Gly Phe Asn Arg Ser Glu Arg Cys Pro
1 5 10 15

Arg Asp Thr Arg Ile Val Gln Leu Val Phe Pro Ala Leu Tyr Thr Val
20 25 30

Val Phe Leu Thr Gly Ile Leu Leu Asn Thr Leu Ala Leu Trp Val Phe
35 40 45

Val His Ile Pro Ser Ser Ser Thr Phe Ile Ile Tyr Leu Lys Asn Thr
50 55 60

Leu Val Ala Asp Leu Ile Met Thr Leu Met Leu Pro Phe Lys Ile Leu
65 70 75 80

Ser Asp Ser His Leu Ala Pro Trp Gln Leu Arg Ala Phe Val Cys Arg
85 90 95

Phe Ser Ser Val Ile Phe Tyr Glu Thr Met Tyr Val Gly Ile Val Leu
100 105 110

Leu Gly Leu Ile Ala Phe Asp Arg Phe Leu Lys Ile Ile Arg Pro Leu
115 120 125

Arg Asn Ile Phe Leu Lys Lys Pro Val Phe Ala Lys Thr Val Ser Ile
130 135 140

Phe Ile Trp Phe Phe Leu Phe Phe Ile Ser Leu Pro Asn Thr Ile Leu
145 150 155 160

Ser Asn Lys Glu Ala Thr Pro Ser Ser Val Lys Lys Cys Ala Ser Leu
165 170 175

Lys Gly Pro Leu Gly Leu Lys Trp His Gln Met Val Asn Asn Ile Cys
180 185 190

Gln Phe Ile Phe Trp Thr Val Phe Ile Leu Met Leu Val Phe Tyr Val
195 200 205

Val Ile Ala Lys Lys Val Tyr Asp Ser Tyr Arg Lys Ser Lys Ser Lys
210 215 220

Asp Arg Lys Asn Asn Lys Lys Leu Glu Gly Lys Val Phe Val Val Val
225 230 235 240

Ala Val Phe Phe Val Cys Phe Ala Pro Phe His Phe Ala Arg Val Pro
245 250 255

Tyr Thr His Ser Gln Thr Asn Asn Lys Thr Asp Cys Arg Leu Gln Asn
260 265 270

Gln Leu Phe Ile Ala Lys Glu Thr Thr Leu Phe Leu Ala Ala Thr Asn
275 280 285

Ile Cys Met Asp Pro Leu Ile Tyr Ile Phe Leu Cys Lys Lys Phe Thr
290 295 300

Glu Lys Leu Pro Cys Met Gln Gly Arg Lys Thr Thr Ala Ser Ser Gln
305 310 315 320

Glu Asn His Ser Ser Gln Thr Asp Asn Ile Thr Leu Gly
325 330

(20) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1122 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
ATGGCCAACA CTACCGGAGA GCCTGAGGAG GTGAGCGGCG CTCTGTCCCC ACCGTCCGCA 60 
TCAGCTTATG TGAAGCTGGT ACTGCTGGGA CTGATTATGT GCGTGAGCCT GGCGGGTAAC 120
GCCATCTTGT CCCTGCTGGT GCTCAAGGAG CGTGCCCTGC ACAAGGCTCC TTACTACTTC 180
CTGCTGGACC TGTGCCTGGC CGATGGCATA CGCTCTGCCG TCTGCTTCCC CTTTGTGCTG 240
GCTTCTGTGC GCCACGGCTC TTCATGGACC TTCAGTGCAC TCAGCTGCAA GATTGTGGCC 300
TTTATGGCCG TGCTCTTTTG CTTCCATGCG GCCTTCATGC TGTTCTGCAT CAGCGTCACC 360
CGCTACATGG CCATCGCCCA CCACCGCTTC TACGCCAAGC GCATGACACT CTGGACATGC 420
GCGGCTGTCA TCTGCATGGC CTGGACCCTG TCTGTGGCCA TGGCCTTCCC ACCTGTCTTT 480
GACGTGGGCA CCTACAAGTT TATTCGGGAG GAGGACCAGT GCATCTTTGA GCATCGCTAC 540
TTCAAGGCCA ATGACACGCT GGGCTTCATG CTTATGTTGG CTGTGCTCAT GGCAGCTACC 600
CATGCTGTCT ACGGCAAGCT GCTCCTCTTC GAGTATCGTC ACCGCAAGAT GAAGCCAGTG 660
CAGATGGTGC CAGCCATCAG CCAGAACTGG ACATTCCATG GTCCCGGGGC CACCGGCCAG 720
GCTGCTGCCA ACTGGATCGC CGGCTTTGGC CGTGGGCCCA TGCCACCAAC CCTGCTGGGT 780
ATCCGGCAGA ATGGGCATGC AGCCAGCCGG CGGCTACTGG GCATGGACGA GGTCAAGGGT 840
GAAAAGCAGC TGGGCCGCAT GTTCTACGCG ATCACACTGC TCTTTCTGCT CCTCTGGTCA 900
CCCTACATCG TGGCCTGCTA CTGGCGAGTG TTTGTGAAAG CCTGTGCTGT GCCCCACCGC 960
TACCTGGCCA CTGCTGTTTG GATGAGCTTC GCGCAGGCTG CCGTCAACCC AATTGTCTGC 1020
TTCCTGCTCA ACAAGGACCT CAAGAAGTGC CTGACCACTC ACGCCCCCTG CTGGGGCACA 1080
GGAGGTGCCC CGGCTCCCAG AGAACCCTAC TGTGTCATGT GA 1122

(21) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

Met Ala Asn Thr Thr Gly Glu Pro Glu Glu Val Ser Gly Ala Leu Ser 

1 5 10 15

Pro Pro Ser Ala Ser Ala Tyr Val Lys Leu Val Leu Leu Gly Leu Ile
20 25 30

Met Cys Val Ser Leu Ala Gly Asn Ala Ile Leu Ser Leu Leu Val Leu
35 40 45

Lys Glu Arg Ala Leu His Lys Ala Pro Tyr Tyr Phe Leu Leu Asp Leu
50 55 60

Cys Leu Ala Asp Gly Ile Arg Ser Ala Val Cys Phe Pro Phe Val Leu
65 70 75 80

Ala Ser Val Arg His Gly Ser Ser Trp Thr Phe Ser Ala Leu Ser Cys
85 90 95

Lys Ile Val Ala Phe Met Ala Val Leu Phe Cys Phe His Ala Ala Phe
100 105 110

Met Leu Phe Cys Ile Ser Val Thr Arg Tyr Met Ala Ile Ala His His
115 120 125

Arg Phe Tyr Ala Lys Arg Met Thr Leu Trp Thr Cys Ala Ala Val Ile
130 135 140

Cys Met Ala Trp Thr Leu Ser Val Ala Met Ala Phe Pro Pro Val Phe
145 150 155 160

Asp Val Gly Thr Tyr Lys Phe Ile Arg Glu Glu Asp Gln Cys Ile Phe
165 170 175

Glu His Arg Tyr Phe Lys Ala Asn Asp Thr Leu Gly Phe Met Leu Met
180 185 190

Leu Ala Val Leu Met Ala Ala Thr His Ala Val Tyr Gly Lys Leu Leu
195 200 205

Leu Phe Glu Tyr Arg His Arg Lys Met Lys Pro Val Gln Met Val Pro
210 215 220

Ala Ile Ser Gln Asn Trp Thr Phe His Gly Pro Gly Ala Thr Gly Gln
225 230 235 240

Ala Ala Ala Asn Trp Ile Ala Gly Phe Gly Arg Gly Pro Met Pro Pro
245 250 255

Thr Leu Leu Gly Ile Arg Gln Asn Gly His Ala Ala Ser Arg Arg Leu
260 265 270

Leu Gly Met Asp Glu Val Lys Gly Glu Lys Gln Leu Gly Arg Met Phe
275 280 285

Tyr Ala Ile Thr Leu Leu Phe Leu Leu Leu Trp Ser Pro Tyr Ile Val
290 295 300

Ala Cys Tyr Trp Arg Val Phe Val Lys Ala Cys Ala Val Pro His Arg
305 310 315 320

Tyr Leu Ala Thr Ala Val Trp Met Ser Phe Ala Gln Ala Ala Val Asn
325 330 335

Pro Ile Val Cys Phe Leu Leu Asn Lys Asp Leu Lys Lys Cys Leu Thr
340 345 350

Thr His Ala Pro Cys Trp Gly Thr Gly Gly Ala Pro Ala Pro Arg Glu
355 360 365

Pro Tyr Cys Val Met
370

(22) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1053 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
ATGGCTTTGG AACAGAACCA GTCAACAGAT TATTATTATG AGGAAAATGA AATGAATGGC 60
—^^^^^^ J^^^^?^? ^9T^^^/^9_ _^^T^?J ?AG_ AGAATTT.GCA- 120- 



AAAGTTTTCC TCCCTGTATT CCTCACAATA GCTTTCGTCA TTGGACTTGC AGGCAATTCC 180
ATGGTAGTGG CAATTTATGC CTATTACAAG AAACAGAGAA CCAAAACAGA TGTGTACATC 240
CTGAATTTGG CTGTAGCAGA TTTACTCCTT CTATTCACTC TGCCTTTTTG GGCTGTTAAT 300
GCAGTTCATG GGTGGGTTTT AGGGAAAATA ATGTGCAAAA TAACTTCAGC CTTGTACACA 360
CTAAACTTTG TCTCTGGAAT GCAGTTTCTG GCTTGCATCA GCATAGACAG ATATGTGGCA 420
GTAACTAATG TCCCCAGCCA ATCAGGAGTG GGAAAACCAT GCTGGATCAT CTGTTTCTGT 480
GTCTGGATGG CTGCCATCTT GCTGAGCATA CCCCAGCTGG TTTTTTATAC AGTAAATGAC 540
AATGCTAGGT GCATTCCCAT TTTCGCCCGC TACCTAGGAA CATCAATGAA AGCATTGATT 600
CAAATGCTAG AGATCTGCAT TGGATTTGTA GTACCCTTTC TTATTATGGG GGTGTGCTAC 660
TTTATCACGG CAAGGACACT CATGAAGATG CCAAACATTA AAATATCTCG ACCCCTAAAA 720
GTTCTGCTCA CAGTCGTTAT AGTTTTCATT GTCACTCAAC TGCCTTATAA CATTGTCAAG 780
TTCTGCCGAG CCATAGACAT CATCTACTCC CTGATCACCA GCTGCAACAT GAGCAAACGC 840
ATGGACATCG CCATCCAAGT CACAGAAAGC ATTGCACTCT TTCACAGCTG CCTCAACCCA 900
ATCCTTTATG TTTTTATGGG AGCATCTTTC AAAAACTACG TTATGAAAGT GGCCAAGAAA 960
TATGGGTCCT GGAGAAGACA GAGACAAAGT GTGGAGGAGT TTCCTTTTGA TTCTGAGGGT 1020
CCTACAGAGC CAACCAGTAC TTTTAGCATT TAA 1053
(23) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 350 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Ala Leu Glu Gln Asn Gln Ser Thr Asp Tyr Tyr Tyr Glu Glu Asn
1 5 10 15

Glu Met Asn Gly Thr Tyr Asp Tyr Ser Gln Tyr Glu Leu Ile Cys Ile
20 25 30

Lys Glu Asp Val Arg Glu Phe Ala Lys Val Phe Leu Pro Val Phe Leu
35 40 45

Thr Ile Ala Phe Val Ile Gly Leu Ala Gly Asn Ser Met Val Val Ala
50 55 60

Ile Tyr Ala Tyr Tyr Lys Lys Gln Arg Thr Lys Thr Asp Val Tyr Ile
65 70 75 80

Leu Asn Leu Ala Val Ala Asp Leu Leu Leu Leu Phe Thr Leu Pro Phe
85 90 95

Trp Ala Val Asn Ala Val His Gly Trp Val Leu Gly Lys Ile Met Cys
100 105 110

Lys Ile Thr Ser Ala Leu Tyr Thr Leu Asn Phe Val Ser Gly Met Gln
115 120 125

Phe Leu Ala Cys Ile Ser Ile Asp Arg Tyr Val Ala Val Thr Asn Val
130 135 140

Pro Ser Gln Ser Gly Val Gly Lys Pro Cys Trp Ile Ile Cys Phe Cys
145 150 155 160

Val Trp Met Ala Ala Ile Leu Leu Ser Ile Pro Gln Leu Val Phe Tyr
165 170 175

Thr Val Asn Asp Asn Ala Arg Cys Ile Pro Ile Phe Pro Arg Tyr Leu
180 185 190

Gly Thr Ser Met Lys Ala Leu Ile Gln Met Leu Glu Ile Cys Ile Gly
195 200 205

Phe Val Val Pro Phe Leu Ile Met Gly Val Cys Tyr Phe Ile Thr Ala
210 215 220

Arg Thr Leu Met Lys Met Pro Asn Ile Lys Ile Ser Arg Pro Leu Lys
225 230 235 240

Val Leu Leu Thr Val Val Ile Val Phe Ile Val Thr Gln Leu Pro Tyr
245 250 255

Asn Ile Val Lys Phe Cys Arg Ala Ile Asp Ile Ile Tyr Ser Leu Ile
260 265 270

Thr Ser Cys Asn Met Ser Lys Arg Met Asp Ile Ala Ile Gln Val Thr
275 280 285

Glu Ser Ile Ala Leu Phe His Ser Cys Leu Asn Pro Ile Leu Tyr Val
290 295 300

Phe Met Gly Ala Ser Phe Lys Asn Tyr Val Met Lys Val Ala Lys Lys
305 310 315 320

Tyr Gly Ser Trp Arg Arg Gln Arg Gln Ser Val Glu Glu Phe Pro Phe
325 330 335

Asp Ser Glu Gly Pro Thr Glu Pro Thr Ser Thr Phe Ser Ile
340 345 350

(24) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1116 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
ATGCCAGGAA ACGCCACCCC AGTGACCACC ACTGCCCCGT GGGCCTCCCT GGGCCTCTCC 60
GCCAAGACCT GCAACAACGT GTCCTTCGAA GAGAGCAGGA TAGTCCTGGT CGTGGTGTAC 120
AGCGCGGTGT GCACGCTGGG GGTGCCGGCC AACTGCCTGA CTGCGTGGCT GGCGCTGCTG 180
CAGGTACTGC AGGGCAACGT GCTGGCCGTC TACCTGCTCT GCCTGGCACT CTGCGAACTG 240
CTGTACACAG GCACGCTGCC ACTCTGGGTC ATCTATATCC GCAACCAGCA CCGCTGGACC 300
CTAGGCCTGC TGGCCTCGAA GGTGACCGCC TACATCTTCT TCTGCAACAT CTACGTCAGC 360
ATCCTCTTCC TGTGCTGCAT CTCCTGCGAC CGCTTCGTGG CCGTGGTGTA CGCGCTGGAG 420
AGTCGGGGCC GCCGCCGCCG GAGGACCGCC ATCCTCATCT CCGCCTGCAT CTTCATCCTC 480
GTCGGGATCG TTCACTACCC GGTGTTCCAG ACGGAAGACA AGGAGACCTG CTTTGACATG 540
CTGCAGATGG ACAGCAGGAT TGCCGGGTAC TACTACGCCA GGTTCACCGT TGGCTTTGCC 600
ATCCCTCTCT CCATCATCGC CTTCACCAAC CACCGGATTT TCAGGAGCAT CAAGCAGAGC 660
ATGGGCTTAA GCGCTGCCCA GAAGGCCAAG GTGAAGCACT CGGCCATCGC GGTGGTTGTC 720
ATCTTCCTAG TCTGCTTCGC CCCGTACCAC CTGGTTCTCC TCGTCAAAGC CGCTGCCTTT 780
TCCTACTACA GAGGAGACAG GAACGCCATG TGCGGCTTGG AGGAAAGGCT GTACACAGCC 840
TCTGTGGTGT TTCTGTGCCT GTCCACGGTG AACGGCGTGG CTGACCCCAT TATCTACGTG 900
CTGGCCACGG ACCATTCCCG CCAAGAAGTG TCCAGAATCC ATAAGGGGTG GAAAGAGTGG 960
TCCATGAAGA CAGACGTCAC CAGGCTCACC CACAGCAGGG ACACCGAGGA GCTGCAGTCG 1020
CCCGTGGCCC TTGCAGACCA CTACACCTTC TCCAGGCCCG TGCACCCACC AGGGTCACCA 1080
TGCCCTGCAA AGAGGCTGAT TGAGGAGTCC TGCTGA 1116
(25) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 371 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Met Pro Gly Asn Ala Thr Pro Val Thr Thr Thr Ala Pro Trp Ala Ser 
1 5 10 15

Leu Gly Leu Ser Ala Lys Thr Cys Asn Asn Val Ser Phe Glu Glu Ser
20 25 30

Arg Ile Val Leu Val Val Val Tyr Ser Ala Val Cys Thr Leu Gly Val
35 40 45

Pro Ala Asn Cys Leu Thr Ala Trp Leu Ala Leu Leu Gln Val Leu Gln
50 55 60

Gly Asn Val Leu Ala Val Tyr Leu Leu Cys Leu Ala Leu Cys Glu Leu
65 70 75 80

Leu Tyr Thr Gly Thr Leu Pro Leu Trp Val Ile Tyr Ile Arg Asn Gln
85 90 95

His Arg Trp Thr Leu Gly Leu Leu Ala Ser Lys Val Thr Ala Tyr Ile
100 105 110

Phe Phe Cys Asn Ile Tyr Val Ser Ile Leu Phe Leu Cys Cys Ile Ser
115 120 125

Cys Asp Arg Phe Val Ala Val Val Tyr Ala Leu Glu Ser Arg Gly Arg
130 135 140

Arg Arg Arg Arg Thr Ala Ile Leu Ile Ser Ala Cys Ile Phe Ile Leu
145 150 155 160

Val Gly Ile Val His Tyr Pro Val Phe Gln Thr Glu Asp Lys Glu Thr
165 170 175

Cys Phe Asp Met Leu Gln Met Asp Ser Arg Ile Ala Gly Tyr Tyr Tyr
180 185 190

Ala Arg Phe Thr Val Gly Phe Ala Ile Pro Leu Ser Ile Ile Ala Phe
195 200 205

Thr Asn His Arg Ile Phe Arg Ser Ile Lys Gln Ser Met Gly Leu Ser
210 215 220

Ala Ala Gln Lys Ala Lys Val Lys His Ser Ala Ile Ala Val Val Val
225 230 235 240

Ile Phe Leu Val Cys Phe Ala Pro Tyr His Leu Val Leu Leu Val Lys
245 250 255

Ala Ala Ala Phe Ser Tyr Tyr Arg Gly Asp Arg Asn Ala Met Cys Gly
260 265 270

Leu Glu Glu Arg Leu Tyr Thr Ala Ser Val Val Phe Leu Cys Leu Ser
275 280 285

Thr Val Asn Gly Val Ala Asp Pro Ile Ile Tyr Val Leu Ala Thr Asp
290 295 300

His Ser Arg Gln Glu Val Ser Arg Ile His Lys Gly Trp Lys Glu Trp
305 310 315 320

Ser Met Lys Thr Asp Val Thr Arg Leu Thr His Ser Arg Asp Thr Glu
325 330 335

Glu Leu Gln Ser Pro Val Ala Leu Ala Asp His Tyr Thr Phe Ser Arg
340 345 350

Pro Val His Pro Pro Gly Ser Pro Cys Pro Ala Lys Arg Leu Ile Glu
355 360 365

Glu Ser Cys
370

(26) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1113 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
ATGGCGAACT ATAGCCATGC AGCTGACAAC ATTTTGCAAA ATCTCTCGCC TCTAACAGCC 60
TTTCTGAAAC TGACTTCCTT GGGTTTCATA ATAGGAGTCA GCGTGGTGGG CAACCTCCTG 120
ATCTCCATTT TGCTAGTGAA AGATAAGACC TTGCATAGAG CACCTTACTA CTTCCTGTTG 180
GATCTTTGCT GTTCAGATAT CCTCAGATCT GCAATTTGTT TCCCATTTGT GTTCAACTCT 240
GTCAAAAATG GCTCTACCTG GACTTATGGG ACTCTGACTT GCAAAGTGAT TGCCTTTCTG 300
GGGGTTTTGT CCTGTTTCCA CACTGCTTTC ATGCTCTTCT GCATCAGTGT CACCAGATAC 360
TTAGCTATCG CCCATCACCG CTTCTATACA AAGAGGCTGA CCTTTTGGAC GTGTCTGGCT 420
GTGATCTGTA TGGTGTGGAC TCTGTCTGTG GCCATGGCAT TTCCCCCGGT TTTAGACGTG 480
GGCACTTACT CATTCATTAG GGAGGAAGAT CAATGCACCT TCCAACACCG CTCCTTCAGG 540
GCTAATGATT CCTTAGGATT TATGCTGCTT CTTGCTCTCA TCCTCCTAGC CACACAGCTT 600
GTCTACCTCA AGCTGATATT TTTCGTCCAC GATCGAAGAA AAATGAAGCC AGTCCAGTTT 660
GTAGCAGCAG TCAGCCAGAA CTGGACTTff CATGGTCCTC GAGCCAGTGG CCAGGCAGCT 720
GCCAATTGGC TAGCAGGATT TGGAAGGGGT CCCACACCAC CCACCTTGCT GGGCATCAGG 780
CAAAATGCAA ACACCACAGG CAGAAGAAGG CTATTGGTCT TAGACGAGTT CAAAATGGAG 840
AAAAGAATCA GCAGAATGTT CTATATAATG ACTTTTCTGT TTCTAACCTT GTGGGGCCCC 900
TACCTGGTGG CCTGTTATTG GAGAGTTTTT GCAAGAGGGC CTGTAGTACC AGGGGGATTT 960
CTAACAGCTG CTGTCTGGAT GAGTTTTGCC CAAGCAGGAA TCAATCCTTT TGTCTGCATT 1020
TTCTCAAACA GGGAGCTGAG GCGCTGTTTC AGCACAACCC TTCTTTACTG CAGAAAATCC 1080
AGGTTACCAA GGGAACCTTA CTGTGTTATA TGA 1113

(27) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 370 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

Met Ala Asn Tyr Ser His Ala Ala Asp Asn Ile Leu Gln Asn Leu Ser
1 5 10 15

Pro Leu Thr Ala Phe Leu Lys Leu Thr Ser Leu Gly Phe Ile Ile Gly
20 25 30

Val Ser Val Val Gly Asn Leu Leu Ile Ser Ile Leu Leu Val Lys Asp
35 40 45

Lys Thr Leu His Arg Ala Pro Tyr Tyr Phe Leu Leu Asp Leu Cys Cys
50 55 60

Ser Asp Ile Leu Arg Ser Ala Ile Cys Phe Pro Phe Val Phe Asn Ser
65 70 75 80

Val Lys Asn Gly Ser Thr Trp Thr Tyr Gly Thr Leu Thr Cys Lys Val
85 90 95

Ile Ala Phe Leu Gly Val Leu Ser Cys Phe His Thr Ala Phe Met Leu
100 105 110

Phe Cys Ile Ser Val Thr Arg Tyr Leu Ala Ile Ala His His Arg Phe
115 120 125

Tyr Thr Lys Arg Leu Thr Phe Trp Thr Cys Leu Ala Val Ile Cys Met
130 135 140

Val Trp Thr Leu Ser Val Ala Met Ala Phe Pro Pro Val Leu Asp Val
145 150 155 160

Gly Thr Tyr Ser Phe Ile Arg Glu Glu Asp Gln Cys Thr Phe Gln His
165 170 175

Arg Ser Phe Arg Ala Asn Asp Ser Leu Gly Phe Met Leu Leu Leu Ala
180 185 190

Leu Ile Leu Leu Ala Thr Gln Leu Val Tyr Leu Lys Leu Ile Phe Phe
195 200 205

Val His Asp Arg Arg Lys Met Lys Pro Val Gln Phe Val Ala Ala Val
210 215 220

Ser Gln Asn Trp Thr Phe His Gly Pro Gly Ala Ser Gly Gln Ala Ala
225 230 235 240

Ala Asn Trp Leu Ala Gly Phe Gly Arg Gly Pro Thr Pro Pro Thr Leu
245 250 255

Leu Gly Ile Arg Gln Asn Ala Asn Thr Thr Gly Arg Arg Arg Leu Leu
260 265 270

Val Leu Asp Glu Phe Lys Met Glu Lys Arg Ile Ser Arg Met Phe Tyr
275 280 285

Ile Met Thr Phe Leu Phe Leu Thr Leu Trp Gly Pro Tyr Leu Val Ala
290 295 300

Cys Tyr Trp Arg Val Phe Ala Arg Gly Pro Val Val Pro Gly Gly Phe
305 310 315 320

Leu Thr Ala Ala Val Trp Met Ser Phe Ala Gln Ala Gly Ile Asn Pro
325 330 335

Phe Val Cys Ile Phe Ser Asn Arg Glu Leu Arg Arg Cys Phe Ser Thr
340 345 350

Thr Leu Leu Tyr Cys Arg Lys Ser Arg Leu Pro Arg Glu Pro Tyr Cys
355 360 365

Val Ile
370

(28) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1080 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

ATGCAGGTCC CGAACAGCAC CGGCCCGGAC AACGCGACGC TGCAGATGCT GCGGAACCCG 60
GCGATCGCGG TGGCCCTGCC CGTGGTGTAC TCGCTGGTGG CGGCGGTCAG CATCCCGGGC 120
AACCTCTTCT CTCTGTGGGT GCTGTGCCGG CGCATGGGGC CCAGATCCCC GTCGGTCATC 180
TTCATGATCA ACCTGAGCGT CACGGACCTG ATGCTGGCCA GCGTGTTGCC TTTCCAAATC 240
TACTACCATT GCAACCGCCA CCACTGGGTA TTCGGGGTGC TGCTTTGCAA CGTGGTGACC 300
GTGGCCTTTT ACGCAAACAT GTATTCCAGC ATCCTCACCA TGACCTGTAT CAGCGTGGAG 360
CGCTTCCTGG GGGTCCTGTA CCCGCTCAGC TCCAAGCGCT GGCGCCGCCG TCGTTACGCG 420
GTGGCCGCGT GTGCAGGGAC CTGGCTGCTG CTCCTGACCG CCCTGTGCCC GCTGGCGCGC 480
ACCGATCTCA CCTACCCGGT GCACGCCCTG GGCATCATCA CCTGCTTCGA CGTCCTCAAG 540
TGGACGATGC TCCCCAGCGT GGCCATGTGG GCCGTGTTCC TCTTCACCAT CTTCATCCTG 600
CTGTTCCTCA TCCCGTTCGT GATCACCGTG GCTTGTTACA CGGCCACCAT CCTCAAGCTG 660
TTGCGCACGG AGGAGGCGCA CGGCCGGGAG CAGCGGAGGC GCGCGGTGGG CCTGGCCGCG 720
GTGGTCTTGC TGGCCTTTGT CACCTGCTTC GCCCCCAACA ACTTCGTGCT CCTGGCGCAC 780
ATCGTGAGCC GCCTGTTCTA CGGCAAGAGC TACTACCACG TGTACAAGCT CACGCTGTGT 840
CTCAGCTGCC TCAACAACTG TCTGGACCCG TTTGTTTATT ACTTTGCGTC CCGGGAATTC 900
CAGCTGCGCC TGCGGGAATA TTTGGGCTGC CGCCGGGTGC CCAGAGACAC CCTGGACACG 960
CGCCGCGAGA GCCTCTTCTC CGCCAGGACC ACGTCCGTGC GCTCCGAGGC CGGTGCGCAC 1020
CCTGAAGGGA TGGAGGGAGC CACCAGGCCC GGCCTCCAGA GGCAGGAGAG TGTGTTCTGA 1080
(29) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 359 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Met Gln Val Pro Asn Ser Thr Gly Pro Asp Asn Ala Thr Leu Gln Met
1 5 10 15

Leu Arg Asn Pro Ala Ile Ala Val Ala Leu Pro Val Val Tyr Ser Leu
20 25 30

Val Ala Ala Val Ser Ile Pro Gly Asn Leu Phe Ser Leu Trp Val Leu
35 40 45

Cys Arg Arg Met Gly Pro Arg Ser Pro Ser Val Ile Phe Met Ile Asn
50 55 60

Leu Ser Val Thr Asp Leu Met Leu Ala Ser Val Leu Pro Phe Gln Ile
65 70 75 80

Tyr Tyr His Cys Asn Arg His His Trp Val Phe Gly Val Leu Leu Cys
85 90 95

Asn Val Val Thr Val Ala Phe Tyr Ala Asn Met Tyr Ser Ser Ile Leu
100 105 110

Thr Met Thr Cys Ile Ser Val Glu Arg Phe Leu Gly Val Leu Tyr Pro
115 120 125

Leu Ser Ser Lys Arg Trp Arg Arg Arg Arg Tyr Ala Val Ala Ala Cys
130 135 140

Ala Gly Thr Trp Leu Leu Leu Leu Thr Ala Leu Cys Pro Leu Ala Arg
145 150 155 160

Thr Asp Leu Thr Tyr Pro Val His Ala Leu Gly Ile Ile Thr Cys Phe
165 170 175

Asp Val Leu Lys Trp Thr Met Leu Pro Ser Val Ala Met Trp Ala Val
180 185 190

Phe Leu Phe Thr Ile Phe Ile Leu Leu Phe Leu Ile Pro Phe Val Ile
195 200 205

Thr Val Ala Cys Tyr Thr Ala Thr Ile Leu Lys Leu Leu Arg Thr Glu
210 215 220

Glu Ala His Gly Arg Glu Gln Arg Arg Arg Ala Val Gly Leu Ala Ala
225 230 235 240

Val Val Leu Leu Ala Phe Val Thr Cys Phe Ala Pro Asn Asn Phe Val
245 250 255

Leu Leu Ala His Ile Val Ser Arg Leu Phe Tyr Gly Lys Ser Tyr Tyr
260 265 270

His Val Tyr Lys Leu Thr Leu Cys Leu Ser Cys Leu Asn Asn Cys Leu
275 280 285

Asp Pro Phe Val Tyr Tyr Phe Ala Ser Arg Glu Phe Gln Leu Arg Leu
290 295 300

Arg Glu Tyr Leu Gly Cys Arg Arg Val Pro Arg Asp Thr Leu Asp Thr
305 310 315 320

Arg Arg Glu Ser Leu Phe Ser Ala Arg Thr Thr Ser Val Arg Ser Glu
325 330 335

Ala Gly Ala His Pro Glu Gly Met Glu Gly Ala Thr Arg Pro Gly Leu
340 345 350

Gln Arg Gln Glu Ser Val Phe
355

(30) INFORMATION FOR SEQ ID NO: 29: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1503 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

ATGGAGCGTC CCTGGGAGGA CAGCCCAGGC CCGGAGGGGG CAGCTGAGGG CTCGCCTGTG 60
CCAGTCGCCG CCGGGGCGCG CTCCGGTGCC GCGGCGAGTG GCACAGGCTG GCAGCCATGG 120
GCTGAGTGCC CGGGACCCAA GGGGAGGGGG CAACTGCTGG CGACCGCCGG CCCTTTGCGT 180
CGCTGGCCCG CCCCCTCGCC TGCCAGCTCC AGCCCCGCCC CCGGAGCGGC GTCCGCTCAC 240
TCGGTTCAAG GCAGCGCGAC TGCGGGTGGC GCACGACCAG GGCGCAGACC TTGGGGCGCG 300
CGGCCCATGG AGTCGGGGCT GCTGCGGCCG GCGCCGGTGA GCGAGGTCAT CGTCCTGCAT 360
TACAACTACA CCGGCAAGCT CCGCGGTGCG AGCTACCAGC CGGGTGCCGG CCTGCGCGCC 420
GACGCCGTGG TGTGCCTGGC GGTGTGCGCC TTCATCGTGC TAGAGAATCT AGCCGTGTTG 480
TTGGTGCTCG GACGCCACCC GCGCTTCCAC GCTCCCATGT TCCTGCTCCT GGGCAGCCTC 540
ACGTTGTCGG ATCTGCTGGC AGGCGCCGCC TACGCCGCCA ACATCCTACT GTCGGGGCCG 600
CTCACGCTGA AACTGTCCCC CGCGCTCTGG TTCGCACGGG AGGGAGGCGT CTTCGTGGCA 660
CTCACTGCGT CCGTGCTGAG CCTCCTGGCC ATCGCGCTGG AGCGCAGCCT CACCATGGCG 720
CGCAGGGGGC CCGCGCCCGT CTCCAGTCGG GGGCGCACGC TGGCGATGGC AGCCGCGGCC 780
TGGGGCGTGT CGCTGCTCCT CGGGCTCCTG CCAGCGCTGG GCTGGAATTG CCTGGGTCGC 840
CTGGACGCTT GCTCCACTGT CTTGCCGCTC TACGCCAAGG CCTACGTGCT CTTCTGCGTG 900
CTCGCCTTCG TGGGCATCCT GGCCGCGATC TGTGCACTCT ACGCGCGCAT CTACTGCGAG 960
GTACGCGCCA ACGCGCGGCG CCTGCCGGCA CGGCCCGGGA CTGCGGGGAC CACCTCGACC 1020
GGGGCGCGTC GCAAGCCGCG CTCTCTGGCC TTGCTGCGCA CGCTCAGCGT GGTGCTCCTG 1080
GCCTTTGTGG CATGTTGGGG CCCCCTCTTC CTGCTGCTGT TGCTCGACGT GGCGTGCCCG 1140
GCGCGCACCT GTCCTGTACT CCTGCAGGCC GATCCCTTCC TGGGACTGGC CATGGCCAAC 1200
TCACTTCTGA ACGCCATCAT CTACACGCTC ACCAACCGCG ACCTGCGCCA CGCGCTCCTG 1260
CGCCTGGTCT GCTGCGGACG CCACTCCTGC GGCAGAGACC CGAGTGGCTC CCAGCAGTCG 1320
GCGAGCGCGG CTGAGGCTTC CGGGGGCCTG CGCCGCTGCC TGCCCCCGGG CCTTGATGGG 1380
AGCTTCAGCG GCTCGGAGCG CTCATCGCCC CAGCGCGACG GGCTGGACAC CAGCGGCTCC 1440
ACAGGCAGCC CCGGTGCACC CACAGCCGCC CGGACTCTGG TATCAGAACC GGCTGCAGAC 1500
TGA 1503

(31) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 500 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 

Met Glu Arg Pro Trp Glu Asp Ser Pro Gly Pro Glu Gly Ala Ala Glu
1 5 10 15

Gly Ser Pro Val Pro Val Ala Ala Gly Ala Arg Ser Gly Ala Ala Ala
20 25 30

Ser Gly Thr Gly Trp Gln Pro Trp Ala Glu Cys Pro Gly Pro Lys Gly
35 40 45

Arg Gly Gln Leu Leu Ala Thr Ala Gly Pro Leu Arg Arg Trp Pro Ala
50 55 60

Pro Ser Pro Ala Ser Ser Ser Pro Ala Pro Gly Ala Ala Ser Ala His
65 70 75 80

Ser Val Gln Gly Ser Ala Thr Ala Gly Gly Ala Arg Pro Gly Arg Arg
85 90 95

Pro Trp Gly Ala Arg Pro Met Glu Ser Gly Leu Leu Arg Pro Ala Pro
100 105 110

Val Ser Glu Val Ile Val Leu His Tyr Asn Tyr Thr Gly Lys Leu Arg
115 120 125

Gly Ala Ser Tyr Gln Pro Gly Ala Gly Leu Arg Ala Asp Ala Val Val
130 135 140

Cys Leu Ala Val Cys Ala Phe Ile Val Leu Glu Asn Leu Ala Val Leu
145 150 155 160

Leu Val Leu Gly Arg His Pro Arg Phe His Ala Pro Met Phe Leu Leu
165 170 175

Leu Gly Ser Leu Thr Leu Ser Asp Leu Leu Ala Gly Ala Ala Tyr Ala
180 185 190

Ala Asn Ile Leu Leu Ser Gly Pro Leu Thr Leu Lys Leu Ser Pro Ala
195 200 205

Leu Trp Phe Ala Arg Glu Gly Gly Val Phe Val Ala Leu Thr Ala Ser
210 215 220

Val Leu Ser Leu Leu Ala Ile Ala Leu Glu Arg Ser Leu Thr Met Ala
225 230 235 240

Arg Arg Gly Pro Ala Pro Val Ser Ser Arg Gly Arg Thr Leu Ala Met
245 250 255

Ala Ala Ala Ala Trp Gly Val Ser Leu Leu Leu Gly Leu Leu Pro Ala
260 265 270

Leu Gly Trp Asn Cys Leu Gly Arg Leu Asp Ala Cys Ser Thr Val Leu
275 280 285

Pro Leu Tyr Ala Lys Ala Tyr Val Leu Phe Cys Val Leu Ala Phe Val
290 295 300

Gly Ile Leu Ala Ala Ile Cys Ala Leu Tyr Ala Arg Ile Tyr Cys Gln
305 310 315 320

Val Arg Ala Asn Ala Arg Arg Leu Pro Ala Arg Pro Gly Thr Ala Gly
325 330 335

Thr Thr Ser Thr Arg Ala Arg Arg Lys Pro Arg Ser Leu Ala Leu Leu
340 345 350

Arg Thr Leu Ser Val Val Leu Leu Ala Phe Val Ala Cys Trp Gly Pro
355 360 365

Leu Phe Leu Leu Leu Leu Leu Asp Val Ala Cys Pro Ala Arg Thr Cys
370 375 380

Pro Val Leu Leu Gln Ala Asp Pro Phe Leu Gly Leu Ala Met Ala Asn
385 390 395 400

Ser Leu Leu Asn Pro Ile Ile Tyr Thr Leu Thr Asn Arg Asp Leu Arg
405 410 415

His Ala Leu Leu Arg Leu Val Cys Cys Gly Arg His Ser Cys Gly Arg
420 425 430

Asp Pro Ser Gly Ser Gln Gln Ser Ala Ser Ala Ala Glu Ala Ser Gly
435 440 445

Gly Leu Arg Arg Cys Leu Pro Pro Gly Leu Asp Gly Ser Phe Ser Gly
450 455 460

Ser Glu Arg Ser Ser Pro Gln Arg Asp Gly Leu Asp Thr Ser Gly Ser
465 470 475 480

Thr Gly Ser Pro Gly Ala Pro Thr Ala Ala Arg Thr Leu Val Ser Glu
485 490 495

Pro Ala Ala Asp
500

(32) INFORMATION FOR SEQ ID NO: 31: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1029 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

ATGCAAGCCG TCGACAATCT CACCTCTGCG CCTGGGAACA CCAGTCTGTG CACCAGAGAC 60
TACAAAATCA CCCAGGTCCT CTTCCCACTG CTCTACACTG TCCTGTTTTT TGTTGGACTT 120
ATCACAAATG GCCTGGCGAT GAGGATTTTC TTTCAAATCC GGAGTAAATC AAACTTTATT 180
ATTTTTCTTA AGAACACAGT CATTTCTGAT CTTCTCATGA TTCTGACTTT TCCATTCAAA 240
ATTCTTAGTG ATGCCAAACT GGGAACAGGA CCACTGAGAA CTTTTGTGTG TCAAGTTACC 300
TCCGTCATAT TTTATTTCAC AATGTATATC AGTATTTCAT TCCTGGGACT GATAACTATC 360 
GATCGCTACC AGAAGACCAC CAGGCCATTT AAAACATCCA ACCCCAAAAA TCTCTTGGGG 420 
GCTAAGATTC TCTCTGTTGT CATCTGGGCA TTCATGTTCT TACTCTCTTT GCCTAACATG 480
ATTCTGACCA ACAGGCAGCC GAGAGACAAG AATGTGAAGA AATGCTCTTT CCTTAAATCA 540

- ^^^^.^ . : ^TTACATCT GTCA^Q"p^r[. ^j^Qg^^ ggg 

AATTTCTTAA TTGTTATTGT ATGTTATACA CTCATTACAA AAGAACTGTA CCGGTCATAC 660
GTAAGAACGA GGGGTGTAGG TAAAGTCCCC AGGAAAAAGG TGAACGTCAA AGTTTTCATT 720
ATCATTGCTG TATTCTTTAT TTGTTTTGTT CCTTTCCATT TTGCCCGAAT TCCTTACACC 780
CTGAGCCAAA CCCGGGATGT CTTTGACTGC ACTGCTGAAA ATACTCTGTT CTATGTGAAA 840
GAGAGCACTC TGTGGTTAAC TTCCTTAAAT GCATGCCTGG ATCCGTTCAT CTATTTTTTC 900
CTTTGCAAGT CCTTCAGAAA TTCCTTGATA AGTATGCTGA AGTGCCCCAA TTCTGCAACA 960
TCTCTGTCCC AGGACAATAG GAAAAAAGAA CAGGATGGTG GTGACCCAAA TGAAGAGACT 1020
CCAATGTAA 1029

(33) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 342 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
Met Gln Ala Val Asp Asn Leu Thr Ser Ala Pro Gly Asn Thr Ser Leu
1 5 10 15

Cys Thr Arg Asp Tyr Lys Ile Thr Gln Val Leu Phe Pro Leu Leu Tyr
20 25 30

Thr Val Leu Phe Phe Val Gly Leu Ile Thr Asn Gly Leu Ala Met Arg
35 40 45

Ile Phe Phe Gln Ile Arg Ser Lys Ser Asn Phe Ile Ile Phe Leu Lys
50 55 60

Asn Thr Val Ile Ser Asp Leu Leu Met Ile Leu Thr Phe Pro Phe Lys
65 70 75 80

Ile Leu Ser Asp Ala Lys Leu Gly Thr Gly Pro Leu Arg Thr Phe Val
85 90 95

Cys Gln Val Thr Ser Val Ile Phe Tyr Phe Thr Met Tyr Ile Ser Ile
100 105 110

Ser Phe Leu Gly Leu Ile Thr Ile Asp Arg Tyr Gln Lys Thr Thr Arg
115 120 125

Pro Phe Lys Thr Ser Asn Pro Lys Asn Leu Leu Gly Ala Lys Ile Leu
130 135 140

Ser Val Val Ile Trp Ala Phe Met Phe Leu Leu Ser Leu Pro Asn Met
145 150 155 160

Ile Leu Thr Asn Arg Gln Pro Arg Asp Lys Asn Val Lys Lys Cys Ser
165 170 175

Phe Leu Lys Ser Glu Phe Gly Leu Val Trp His Glu Ile Val Asn Tyr
180 185 190

Ile Cys Gln Val Ile Phe Trp Ile Asn Phe Leu Ile Val Ile Val Cys
195 200 205

Tyr Thr Leu Ile Thr Lys Glu Leu Tyr Arg Ser Tyr Val Arg Thr Arg
210 215 220

Gly Val Gly Lys Val Pro Arg Lys Lys Val Asn Val Lys Val Phe Ile
225 230 235 240

Ile Ile Ala Val Phe Phe Ile Cys Phe Val Pro Phe His Phe Ala Arg
245 250 255

Ile Pro Tyr Thr Leu Ser Gln Thr Arg Asp Val Phe Asp Cys Thr Ala
260 265 270

Glu Asn Thr Leu Phe Tyr Val Lys Glu Ser Thr Leu Trp Leu Thr Ser
275 280 285

Leu Asn Ala Cys Leu Asp Pro Phe Ile Tyr Phe Phe Leu Cys Lys Ser
290 295 300

Phe Arg Asn Ser Leu Ile Ser Met Leu Lys Cys Pro Asn Ser Ala Thr
305 310 315 320

Ser Leu Ser Gln Asp Asn Arg Lys Lys Glu Gln Asp Gly Gly Asp Pro
325 330 335

Asn Glu Glu Thr Pro Met
340

(34) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1077 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) "' 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO.-33: 
ATGTCGGTCT GCTACCGTCC CCCAGGGAAC GAGACACTGC TGAGCTGGAA GACTTCGCGG 60
GCCACAGGCA CAGCCTTCCT GCTGCTGGCG GCGCTGCTGG GGCTGCCTGG CAACGGCTTC 120
GTGGTGTGGA GCTTGGCGGG CTGGCGGCCT GCACGGGGGC GACCGCTGGC GGCCACGCTT 180
GTGCTGCACC TGGCGCTGGC CGACGGCGCG GTGCTGCTGC TCACGCCGCT CTTTGTGGCC 240
TTCCTGACCC GGCAGGCCTG GCCGCTGGGC CAGGCGGGCT GCAAGGCGGT GTACTACGTG 300
TGCGCGCTCA GCATGTACGC CAGCGTGCTG CTCACCGGCC TGCTCAGCCT GCAGCGCTGG 360
CTCGCAGTCA CCCGCCCCTT CCTGGCGCCT CGGCTGCGCA GCCCGGCCCT GGCCCGCCGC 420
CTGCTGCTGG CGGTCTGGCT GGCCGCCCTG TTGCTCGCCG TCCCGGCCGC CGTCTACCGC 480
CACCTGTGGA GGGACCGCGT ATGCCAGCTG TGCCACCCGT CGCCGGTCCA CGCCGCCGCC 540
CACCTGAGCC TGGAGACTCT GACCGCTTTC GTGCTTCCTT TCGGGCTGAT GCTCGGCTGC 600
TACAGCGTGA CGCTGGCACG GCTGCGGGGC GCCCGCTGGG GCTCCGGGCG GCACGGGGCG 660
CGGGTGGGCC GGCTGGTGAG CGCCATCGTG CTTGCCTTCG GCTTGCTCTG GGCCCCCTAC 720
CACGCAGTCA ACCTTCTGCA GGCGGTCGCA GCGCTGGCTC CACCGGAAGG GGCCTTGGCG 780
AAGCTGGGCG GAGCCGGCCA GGCGGCGCGA GCGGGAACTA CGGCCTTGGC CTTCTTCAGT 840
TCTAGCGTCA ACCCGGTGCT CTACGTCTTC ACCGCTGGAG ATCTGCTGCC CCGGGCAGGT 900
CCCCGTTTCC TCACGCCGCT CTTCGAAGGC TCTGGGGAGG CCCGAGGGGG CGGCCGCTCT 960
AGGGAAGGGA CCATGGAGCT CCGAACTACC CCTCAGCTGA AAGTGGTGGG GCAGGGCCGC 1020
GGCAATGGAG ACCCGGGGGG TGGGATGGAG AAGGACGGTC CGGAATGGGA CCTTTGA 1077
(35) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 358 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Met Ser Val Cys Tyr Arg Pro Pro Gly Asn Glu Thr Leu Leu Ser Trp
1 5 10 15

Lys Thr Ser Arg Ala Thr Gly Thr Ala Phe Leu Leu Leu Ala Ala Leu
20 25 30

Leu Gly Leu Pro Gly Asn Gly Phe Val Val Trp Ser Leu Ala Gly Trp
35 40 45

Arg Pro Ala Arg Gly Arg Pro Leu Ala Ala Thr Leu Val Leu His Leu
50 55 60

Ala Leu Ala Asp Gly Ala Val Leu Leu Leu Thr Pro Leu Phe Val Ala
65 70 75 80

Phe Leu Thr Arg Gln Ala Trp Pro Leu Gly Gln Ala Gly Cys Lys Ala
85 90 95

Val Tyr Tyr Val Cys Ala Leu Ser Met Tyr Ala Ser Val Leu Leu Thr
100 105 110

Gly Leu Leu Ser Leu Gln Arg Cys Leu Ala Val Thr Arg Pro Phe Leu
115 120 125

Ala Pro Arg Leu Arg Ser Pro Ala Leu Ala Arg Arg Leu Leu Leu Ala
130 135 140

Val Trp Leu Ala Ala Leu Leu Leu Ala Val Pro Ala Ala Val Tyr Arg
145 150 155 160

His Leu Trp Arg Asp Arg Val Cys Gln Leu Cys His Pro Ser Pro Val
165 170 175

His Ala Ala Ala His Leu Ser Leu Glu Thr Leu Thr Ala Phe Val Leu
180 185 190

Pro Phe Gly Leu Met Leu Gly Cys Tyr Ser Val Thr Leu Ala Arg Leu
195 200 205

Arg Gly Ala Arg Trp Gly Ser Gly Arg His Gly Ala Arg Val Gly Arg
210 215 220

Leu Val Ser Ala Ile Val Leu Ala Phe Gly Leu Leu Trp Ala Pro Tyr
225 230 235 240

His Ala Val Asn Leu Leu Gln Ala Val Ala Ala Leu Ala Pro Pro Glu
245 250 255

Gly Ala Leu Ala Lys Leu Gly Gly Ala Gly Gln Ala Ala Arg Ala Gly
260 265 270

Thr Thr Ala Leu Ala Phe Phe Ser Ser Ser Val Asn Pro Val Leu Tyr
275 280 285

Val Phe Thr Ala Gly Asp Leu Leu Pro Arg Ala Gly Pro Arg Phe Leu
290 295 300

Thr Arg Leu Phe Glu Gly Ser Gly Glu Ala Arg Gly Gly Gly Arg Ser
305 310 315 320

Arg Glu Gly Thr Met Glu Leu Arg Thr Thr Pro Gln Leu Lys Val Val
325 330 335

Gly Gln Gly Arg Gly Asn Gly Asp Pro Gly Gly Gly Met Glu Lys Asp
340 345 350

Gly Pro Glu Trp Asp Leu
355

(36) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1005 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
ATGCTGGGGA TCATGGCATG GAATGCAACT TGCAAAAACT GGCTGGCAGC AGAGGCTGCC 60
CTGGAAAAGT ACTACCTTTC CATTTTTTAT GGGATTGAGT TCGTTGTGGG AGTCCTTGGA 120
AATACCATTG TTGTTTACGG CTACATCTTC TCTCTGAAGA ACTGGAACAG CAGTAATATT 180
TATCTCTTTA ACCTCTCTGT CTCTGACTTA GCTTTTCTGT GCACCCTCCC CATGCTGATA 240
AGGAGTTATG CCAATGGAAA CTGGATATAT GGAGACGTGC TCTGCATAAG CAACCGATAT 300
GTGCTTCATG CCAACCTCTA TACCAGCATT CTCTTTCTCA CTTTTATCAG CATAGATCGA 360
TACTTGATAA TTAAGTATCC TTTCCGAGAA CACCTTCTGC AAAAGAAAGA GTTTGCTATT 420
TTAATCTCCT TGGCCATTTG GGTTTTAGTA ACCTTAGAGT TACTACCCAT ACTTCCCCTT 480
ATAAATCCTG TTATAACTGA CAATGGCACC ACCTGTAATG ATTTTGCAAG TTCTGGAGAC 540
CCCAACTACA ACCTCATTTA CAGCATGTGT CTAACACTGT TGGGGTTCCT TATTCCTCTT 600
TTTGTGATGT GTTTCTTTTA TTACAAGATT GCTCTCTTCC TAAAGCAGAG GAATAGGCAG 660
GTTGCTACTG CTCTGCCCCT TGAAAAGCCT CTCAACTTGG TCATCATGGC AGTGGTAATC 720
TTCTCTGTGC TTTTTACACC CTATCACGTC ATGCGGAATG TGAGGATCGC TTCACGCCTG 780
GGGAGTTGGA AGCAGTATCA GTGCACTCAG GTCGTCATCA ACTCCTTTTA CATTGTGACA 840
CGGCCTTTGG CCTTTCTGAA CAGTGTCATC AACCCTGTCT TCTATTTTCT TTTGGGAGAT 900
CACTTCAGGG ACATGCTGAT GAATCAACTG AGACACAACT TCAAATCCCT TACATCCTTT 960
AGCAGATGGG CTCATGAACT CCTACTTTCA TTCAGAGAAA AGTGA 1005

(37) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 334 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Met Leu Gly Ile Met Ala Trp Asn Ala Thr Cys Lys Asn Trp Leu Ala
1 5 10 15

Ala Glu Ala Ala Leu Glu Lys Tyr Tyr Leu Ser Ile Phe Tyr Gly Ile
20 25 30

Glu Phe Val Val Gly Val Leu Gly Asn Thr Ile Val Val Tyr Gly Tyr
35 40 45

Ile Phe Ser Leu Lys Asn Trp Asn Ser Ser Asn Ile Tyr Leu Phe Asn
50 55 60

Leu Ser Val Ser Asp Leu Ala Phe Leu Cys Thr Leu Pro Met Leu Ile
65 70 75 80

Arg Ser Tyr Ala Asn Gly Asn Trp Ile Tyr Gly Asp Val Leu Cys Ile
85 90 95

Ser Asn Arg Tyr Val Leu His Ala Asn Leu Tyr Thr Ser Ile Leu Phe
100 105 110

Leu Thr Phe Ile Ser Ile Asp Arg Tyr Leu Ile Ile Lys Tyr Pro Phe
115 120 125

Arg Glu His Leu Leu Gln Lys Lys Glu Phe Ala Ile Leu Ile Ser Leu
130 135 140

Ala Ile Trp Val Leu Val Thr Leu Glu Leu Leu Pro Ile Leu Pro Leu
145 150 155 160

Ile Asn Pro Val Ile Thr Asp Asn Gly Thr Thr Cys Asn Asp Phe Ala
165 170 175

Ser Ser Gly Asp Pro Asn Tyr Asn Leu Ile Tyr Ser Met Cys Leu Thr
180 185 190

Leu Leu Gly Phe Leu Ile Pro Leu Phe Val Met Cys Phe Phe Tyr Tyr
195 200 205

Lys Ile Ala Leu Phe Leu Lys Gln Arg Asn Arg Gln Val Ala Thr Ala
210 215 220

225 230 235 240

Phe Ser Val Leu Phe Thr Pro Tyr His Val Met Arg Asn Val Arg Ile
245 250 255

Ala Ser Arg Leu Gly Ser Trp Lys Gln Tyr Gln Cys Thr Gln Val Val
260 265 270

Ile Asn Ser Phe Tyr Ile Val Thr Arg Pro Leu Ala Phe Leu Asn Ser
275 280 285

Val Ile Asn Pro Val Phe Tyr Phe Leu Leu Gly Asp His Phe Arg Asp
290 295 300

Met Leu Met Asn Gln Leu Arg His Asn Phe Lys Ser Leu Thr Ser Phe
305 310 315 320

Ser Arg Trp Ala His Glu Leu Leu Leu Ser Phe Arg Glu Lys
325 330

(38) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1296 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
ATGCAGGCGC TTAACATTAC CCCGGAGCAG

ACGCIJCGMC TCTGTACCGG CTGCGACCGC TCGTCTACAC CCCAGAG 120

CGGGGACGCG CCAAGCGGC CCTCGTGCTC ACCGGCGTGC TCATCTTCGC GCTGGCGCTC 180
TTTGGCAATG CTCTGGTGTT CTACGTGGTG ACCCGCAGCA AGGCCATGCG CACCGTCACC 240
AACATCTTTA TCTGCTCCTT GGCGCTCAGT GACCTGCTCA TCACCTTCTT CTGCATTCCC 300
GTCACCATGC TCCAGAACAT TTCCGACAAC TGGCTGGGGG GTGCTTTCAT TTGCAAGATG 360
GTGCCATTTC TCCAGTCTAC CGCTGTTGTG ACAGAAATGC TCACTATGAC CTGCATTGCT 420
GTGGAAAGGC ACCAGGGACT TGTGCATCCT TTTAAAATGA AGTGGCAATA CACCAACCGA 480
AGGGCTTTCA CAATGCTAGG TGTGGTCTGG CTGGTGGCAG TCATCGTAGG ATCACCCATG 540
TGGCACGTGC AACAACTTGA GATCAAATAT GACTTCCTAT ATGAAAAGGA ACACATCTGC 600
TGCTTAGAAG AGTGGACCAG CCCTGTGCAC CAGAAGATCT ACACCACCTT CATCCTTGTC 660
ATCCTCTTCC TCCTGCCTCT TATGGTGATG CTTATTCTGT ACAGTAAAAT TGGTTATGAA 720

CTTTGGATAA AGAAAAGAGT TGGGGATGGT TCAGTGCTTC GAACTATTCA TGGAAAAGAA 780

ATGTCCAAAA TAGCCAGGAA GAAGAAACGA GCTGTCATTA TGATGGTGAC AGTGGTGGCT 840

CTCTTTGCTG TGTGCTGGGC ACCATTCCAT GTTGTCCATA TGATGATTGA ATACAGTAAT 900

TTTGAAAAGG AATATGATGA TGTCACAATC AAGATGATTT TTGCTATCGT GCAAATTATT 960

GGATTTTCCA ACTCCATCTG TAATCCCATT GTCTATGCAT TTATGAATGA AAACTTCAAA 1020

AAAAATGTTT TGTCTGCAGT TTGTTATTGC ATAGTAAATA AAACCTTCTC TCCAGCACAA 1080

AGGCATGGAA ATTCAGGAAT TAGAATGATG CGGAAGAAAG CAAAGTTTTC CCTCAGAGAG 1140

AATCCAGTGG AGGAAACCAA AGGAGAAGCA TTCAGTGATG GCAACATTGA AGTCAAATTG 1200

TGTGAACAGA CAGAGGAGAA GAAAAAGCTC AAACGACATC TTGCTCTCTT TAGGTCTGAA 1260

CTGGCTGAGA ATTCTCCTTT AGACAGTGGG CATTAA 1296

(39) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 431 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:



Met Gln Ala Leu Asn Ile Thr Pro Glu Gln Phe Ser Arg Leu Leu Arg
1 5 10 15

Asp His Asn Leu Thr Arg Glu Gln Phe Ile Ala Leu Tyr Arg Leu Arg
20 25 30

Pro Leu Val Tyr Thr Pro Glu Leu Pro Gly Arg Ala Lys Leu Ala Leu
35 40 45

Val Leu Thr Gly Val Leu Ile Phe Ala Leu Ala Leu Phe Gly Asn Ala
50 55 60

Leu Val Phe Tyr Val Val Thr Arg Ser Lys Ala Met Arg Thr Val Thr
65 70 75 80

Asn Ile Phe Ile Cys Ser Leu Ala Leu Ser Asp Leu Leu Ile Thr Phe
85 90 95

Phe Cys Ile Pro Val Thr Met Leu Gln Asn Ile Ser Asp Asn Trp Leu



100 105 110

Gly Gly Ala Phe Ile Cys Lys Met Val Pro Phe Val Gln Ser Thr Ala
115 120 125

Val Val Thr Glu Met Leu Thr Met Thr Cys Ile Ala Val Glu Arg His
130 135 140

Gln Gly Leu Val His Pro Phe Lys Met Lys Trp Gln Tyr Thr Asn Arg
145 150 155 160

Arg Ala Phe Thr Met Leu Gly Val Val Trp Leu Val Ala Val Ile Val
165 170 175

Gly Ser Pro Met Trp His Val Gln Gln Leu Glu Ile Lys Tyr Asp Phe
180 185 190

Leu Tyr Glu Lys Glu His Ile Cys Cys Leu Glu Glu Trp Thr Ser Pro
195 200 205

Val His Gln Lys Ile Tyr Thr Thr Phe Ile Leu Val Ile Leu Phe Leu
210 215 220

Leu Pro Leu Met Val Met Leu Ile Leu Tyr Ser Lys Ile Gly Tyr Glu
225 230 235 240

Leu Trp Ile Lys Lys Arg Val Gly Asp Gly Ser Val Leu Arg Thr Ile
245 250 255

His Gly Lys Glu Met Ser Lys Ile Ala Arg Lys Lys Lys Arg Ala Val
260 265 270

Ile Met Met Val Thr Val Val Ala Leu Phe Ala Val Cys Trp Ala Pro
275 280 285

Phe His Val Val His Met Met Ile Glu Tyr Ser Asn Phe Glu Lys Glu
290 295 300

Tyr Asp Asp Val Thr Ile Lys Met Ile Phe Ala Ile Val Gln Ile Ile
305 310 315 320

Gly Phe Ser Asn Ser Ile Cys Asn Pro Ile Val Tyr Ala Phe Met Asn
325 330 335

Glu Asn Phe Lys Lys Asn Val Leu Ser Ala Val Cys Tyr Cys Ile Val
340 345 350

Asn Lys Thr Phe Ser Pro Ala Gln Arg His Gly Asn Ser Gly Ile Thr
355 360 365

Met Met Arg Lys Lys Ala Lys Phe Ser Leu Arg Glu Asn Pro Val Glu
370 375 380

Glu Thr Lys Gly Glu Ala Phe Ser Asp Gly Asn Ile Glu Val Lys Leu
385 390 395 400

Cys Glu Gln Thr Glu Glu Lys Lys Lys Leu Lys Arg His Leu Ala Leu
405 410 415

Phe Arg Ser Glu Leu Ala Glu Asn Ser Pro Leu Asp Ser Gly His
420 425 430
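
The LENGTH fields of the paired nucleic acid and protein entries can be cross-checked arithmetically: an open reading frame that ends in one stop codon should span three bases per residue plus three more. The following is a minimal sketch of that check, not part of the listing itself. The helper names `coding_length` and `check_pairs` are hypothetical, and the numbers are copied from the LENGTH fields of SEQ ID NOs: 33 to 38.

```python
# Lengths copied from the LENGTH fields of the sequence listing.
# Each tuple: (dna_seq_id, dna_length_bp, protein_seq_id, protein_length_aa)
LISTED_PAIRS = [
    (33, 1077, 34, 358),
    (35, 1005, 36, 334),
    (37, 1296, 38, 431),
]


def coding_length(n_residues: int) -> int:
    """Base pairs needed to encode n_residues plus one stop codon."""
    return 3 * (n_residues + 1)


def check_pairs(pairs):
    """Return (dna_id, protein_id, consistent) for every listed pair."""
    return [
        (dna_id, prot_id, dna_bp == coding_length(prot_aa))
        for dna_id, dna_bp, prot_id, prot_aa in pairs
    ]


results = check_pairs(LISTED_PAIRS)
for dna_id, prot_id, ok in results:
    status = "consistent" if ok else "MISMATCH"
    print(f"SEQ ID NO:{dna_id} vs SEQ ID NO:{prot_id}: {status}")
```

A length match only shows that the two entries are compatible in size; it does not confirm that every codon translates to the listed residue.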

(40) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
CTGTGTACAG CAGTTCGCAG AGTG

(41) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
GAGTGCCAGG CAGAGCAGGT AGAC

(42) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
CCCGAATTCC TGCTTGCTCC CAGCTTGGCC C

(43) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
TGTGGATCCT GCTGTCAAAG GTCCCATTCC GG
(44) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
TCACAATGCT AGGTGTGGTC

(45) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
TGCATAGACA ATGGGATTAC AG

(46) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 511 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

TCACAATGCT AGGTGTGGTC TGGCTGGTGG CAGTCATCGT AGGATCACCC ATGTGGCACG 60

TGCAACAACT TGAGATCAAA TATGACTTCC TATATGAAAA GGAACACATC TGCTGCTTAG 120

AAGAGTGGAC CAGCCCTGTG CACCAGAAGA TCTACACCAC CTTCATCCTT GTCATCCTCT 180

TCCTCCTGCC TCTTATGGTG ATGCTTATTC TGTACGTAAA ATTGGTTATG AACTTTGGAT 240

AAAGAAAAGA GTTGGGGATG GTTCAGTGCT TCGAACTATT CATGGAAAAG AAATGTCCAA 300

AATAGCCAGG AAGAAGAAAC GAGCTGTCAT TATGATGGTG ACAGTGGTGG CTCTCTTTGC 360

TGTGTGCTGG GCACCATTCC ATGTTGTCCA TATGATGATT GAATACAGTA ATTTTGAAAA 420

GGAATATGAT GATGTCACAA TCAAGATGAT TTTTGCTATC GTGCAAATTA TTGGATTTTC 480

CAACTCCATC TGTAATCCCA TTGTCTATGC A 511
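
Comparing the strings shows that the 511 base pair sequence begins with the 20-mer marked ANTI-SENSE: NO and ends with the reverse complement of the 22-mer marked ANTI-SENSE: YES. The sketch below illustrates that comparison under stated assumptions: the function name `reverse_complement` and the variable names are hypothetical, the sequences are copied from the listing with spaces removed, and only the first 30 and last 31 bases of the 511 base pair entry are used. The listing itself does not state how the fragment was produced.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string containing only A/C/G/T."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


# Copied from the listing, spaces removed.
sense_primer = "TCACAATGCTAGGTGTGGTC"        # 20-mer, ANTI-SENSE: NO
antisense_primer = "TGCATAGACAATGGGATTACAG"  # 22-mer, ANTI-SENSE: YES

# First 30 and last 31 bases of the 511 bp sequence (not the full entry).
fragment_head = "TCACAATGCTAGGTGTGGTCTGGCTGGTGG"
fragment_tail = "CAACTCCATCTGTAATCCCATTGTCTATGCA"

# The sense primer should match the 5' end of the fragment as written;
# the anti-sense primer should match the 3' end once reverse-complemented.
starts_with_sense = fragment_head.startswith(sense_primer)
ends_with_antisense = fragment_tail.endswith(reverse_complement(antisense_primer))

print("5' end matches sense primer:", starts_with_sense)
print("3' end matches anti-sense primer:", ends_with_antisense)
```

The same helper can be applied to any other primer pair in the listing to see whether an ANTI-SENSE: YES entry lies on the opposite strand of a given sequence.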

(47) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CTGCTTAGAA GAGTGGACCA G

(48) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
CTGTGCACCA GAAGATCTAC AC

(49) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
CAAGGATGAA GGTGGTGTAG A

(50) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
GTGTAGATCT TCTGGTGCAC AGG

(51) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
GCAATGCAGG TCATAGTGAG C
(52) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
TGGAGCATGG TGACGGGAAT GCAGAAG

(53) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
GTGATGAGCA GGTCACTGAG CGCCAAG

(54) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
GCAATGCAGG CGCTTAACAT TAC

(55) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
TTGGGTTACA ATCTGAAGGG CA

(56) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
ACTCCGTGTC CAGCAGGACT CTG
(57) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
TGCGTGTTCC TGGACCCTCA CGTG

(58) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
CAGGCCTTGG ATTTTAATGT CAGGGATGG

(59) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
GGAGAGTCAG CTCTGAAAGA ATTCAGG
(60) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
TGATGTGATG CCAGATACTA ATAGCAC

(61) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
CCTGATTCAT TTAGGTGAGA TTGAGAC

(62) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
GACAGGTACC TTGCCATCAA G
(63) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
CTGCACAATG CCAGTGATAA GG

(64) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
CTGACTTCTT GTTCCTGGCA GCAGCGG

(65) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
AGACCAGCCA GGGCACGCTG AAGAGTG

(66) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
GATCAAGCTT CCATCCTACT GAAACCATGG TC

(67) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
GATCAGATCT CAGTTCCAAT ATTCACACCA CCGTC

(68) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
CTGGTGTGCT CCATGGCATC CC

(69) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
GTAAGCCTCC CAGAACGAGA GG
(70) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CAGCGCAGGG TGAAGCCTGA GAGC

(71) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
GGCACCTGCT GTGACCTGTG CAGG

(72) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
GTCCTGCCAC TTCGAGACAT GG

(73) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
GAAACTTCTC TGCCCTTACC GTC
(74) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:
GGAGAGTCAG CTCTGAAAGA ATTCAGG

This International Searching Authority found multiple (groups of)
inventions in this international application, as follows:

1. Claims: 1-4

Human G protein-coupled receptor as characterized by
SEQ.ID.2, a cDNA encoding said receptor as characterized by
SEQ.ID.1, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

2. Claims: 5-8

Human G protein-coupled receptor as characterized by
SEQ.ID.4, a cDNA encoding said receptor as characterized by
SEQ.ID.3, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

3. Claims: 9-12

Human G protein-coupled receptor as characterized by
SEQ.ID.6, a cDNA encoding said receptor as characterized by
SEQ.ID.5, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

4. Claims: 13-16

Human G protein-coupled receptor as characterized by
SEQ.ID.8, a cDNA encoding said receptor as characterized by
SEQ.ID.7, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

5. Claims: 17-20

Human G protein-coupled receptor as characterized by
SEQ.ID.10, a cDNA encoding said receptor as characterized by
SEQ.ID.9, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

6. Claims: 21-24

Human G protein-coupled receptor as characterized by
SEQ.ID.12, a cDNA encoding said receptor as characterized by
SEQ.ID.11, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

8. Claims: 29-32

Human G protein-coupled receptor as characterized by
SEQ.ID.16, a cDNA encoding said receptor as characterized by
SEQ.ID.15, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

9. Claims: 33-36

Human G protein-coupled receptor as characterized by
SEQ.ID.18, a cDNA encoding said receptor as characterized by
SEQ.ID.17, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

10. Claims: 37-40

Human G protein-coupled receptor as characterized by
SEQ.ID.20, a cDNA encoding said receptor as characterized by
SEQ.ID.19, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

11. Claims: 41-44

Human G protein-coupled receptor as characterized by
SEQ.ID.22, a cDNA encoding said receptor as characterized by
SEQ.ID.21, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

12. Claims: 45-48

Human G protein-coupled receptor as characterized by
SEQ.ID.24, a cDNA encoding said receptor as characterized by
SEQ.ID.23, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

13. Claims: 49-52

Human G protein-coupled receptor as characterized by
SEQ.ID.26, a cDNA encoding said receptor as characterized by
SEQ.ID.25, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

SEQ.ID.28, a cDNA encoding said receptor as characterized by
SEQ.ID.27, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

15. Claims: 57-60

Human G protein-coupled receptor as characterized by
SEQ.ID.30, a cDNA encoding said receptor as characterized by
SEQ.ID.29, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

16. Claims: 61-64

Human G protein-coupled receptor as characterized by
SEQ.ID.32, a cDNA encoding said receptor as characterized by
SEQ.ID.31, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.



17. Claims: 65-68

Human G protein-coupled receptor as characterized by
SEQ.ID.34, a cDNA encoding said receptor as characterized by
SEQ.ID.33, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

18. Claims: 69-72

Human G protein-coupled receptor as characterized by
SEQ.ID.36, a cDNA encoding said receptor as characterized by
SEQ.ID.35, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.

19. Claims: 73-76

Human G protein-coupled receptor as characterized by
SEQ.ID.38, a cDNA encoding said receptor as characterized by
SEQ.ID.37, a plasmid comprising said cDNA, and a host cell
comprising said plasmid.
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